DUKE MATHEMATICAL JOURNAL


EDITED BY 
LEONARD CARLITZ DAVID VERNON WIDDER 


JOSEPH MILLER THOMAS 
Managing Editor 


WITH THE COOPERATION OF

R. P. BOAS, JR.    J. W. GREEN    W. T. MARTIN    R. J. WALKER
H. S. M. COXETER    G. A. HEDLUND    F. J. MURRAY    MORGAN WARD
J. L. DOOB    N. LEVINSON    GORDON PALL    HASSLER WHITNEY
J. J. GERGEN    E. J. MCSHANE    J. H. ROBERTS    G. T. WHYBURN
C. C. MacDUFFEE    J. W. TUKEY


Volume 9, Number 2 
JUNE, 1942 








DUKE MATHEMATICAL JOURNAL 


This periodical is published quarterly under the auspices of Duke University by Duke University Press at Durham, North Carolina. It is printed at Mount Royal and Guilford Avenues, Baltimore, Maryland, by the Waverly Press.

Entered as second class matter at the Post Office, Durham, North Carolina. Additional entry at the Post Office, Baltimore, Maryland.

The subscription price for the current year is four dollars, postpaid; back volumes, five dollars each, carriage extra. Subscriptions, orders for back numbers, and notice of change of address should be sent to Duke University Press, Durham, North Carolina.

Individual and institutional members of the Mathematical Association of America may subscribe to the current volume at half price. To get the reduced price, orders for subscriptions must bear the mention "Member MAA." If an order at the reduced price is placed through an agent, the purchaser must pay any commission charge incurred.

Since 1935 the Mathematical Association of America has given the Duke Mathematical Journal an annual subsidy, in return for which the half-rate has been allowed. Having served its purpose of aiding the establishment of the Journal, the subsidy is to be discontinued at the end of 1942. In view of the help already received from the Association, however, Duke University Press will for at least five years continue to allow the half-rate to any one who in 1942 is a subscriber at the reduced rate provided his subscription to the Journal and membership in the Association remain unbroken. This arrangement is expected to be permanent, but the Press reserves the right to modify or withdraw it after the five years and to change the basic rate at the beginning of any calendar year.

Manuscripts and editorial correspondence should be addressed to Duke Mathematical Journal, 4785 Duke Station, Durham, North Carolina. An Author's Manual containing detailed information about the preparation of papers for publication will be sent on request.

Authors are entitled to one hundred free reprints. Additional copies will be supplied at cost. All reprints will be furnished with covers unless the contrary is specifically requested.

The American Mathematical Society is officially represented on the Editorial Board by Professors Murray and Ward.

Made in United States of America

WAVERLY PRESS, INC.
BALTIMORE, U. S. A.











THE ASYMPTOTIC FORMS OF THE SOLUTIONS OF AN ORDINARY 
LINEAR MATRIC DIFFERENTIAL EQUATION 
IN THE COMPLEX DOMAIN 


By HOMER E. NEWELL, JR.

1. Introduction. The matric differential equation

$$ Y'(x, \lambda) = \{\lambda(\delta_{ij} r_j(x)) + (q_{ij}(x, \lambda))\}\, Y(x, \lambda), $$

under conditions to be given below, has solutions of the form $P(x, \lambda)E(x, \lambda)$, where $E(x, \lambda) = (\delta_{ij} \exp\{\lambda \int^x r_j(x)\,dx\})$ and $P(x, \lambda)$, analytic in $x$, reduces uniformly in $x$ to the identity matrix when $\lambda$ becomes infinite.

The present discussion rests directly upon, and extends, theory recently published by R. E. Langer¹ who showed that if the coefficient functions $r_j(x)$ are all analytic and bounded in a region of the complex plane and their differences $r_i(x) - r_j(x)$ $(i \neq j)$ are all bounded from zero and if, moreover, the functions $q_{ij}(x, \lambda)$ are analytic in $x$, bounded in $x$ and $\lambda$ and, for $|\lambda|$ large, admit either actual or asymptotic expansions in $\lambda$ with coefficients analytic and bounded in $x$, then a solution of the form $P(x, \lambda)E(x, \lambda)$ exists for the above differential equation in the neighborhood of any specified point of the given $x$ region. This paper, for the most part, deals with equations in which the coefficient functions $r_j(x)$ may have poles and the differences $r_i(x) - r_j(x)$ $(i \neq j)$ may have zeros on the boundary of the $x$ region in question. Regions of existence which abut such a pole or zero are established for a solution of the stated form $P(x, \lambda)E(x, \lambda)$.

In addition, it is shown that the restriction to finite $x$ regions, assumed in Langer's discussion, may be removed.


2. The matric equation. Throughout the considerations to follow, the differential equation²

$$ \frac{d}{dx} Y(x, \lambda) = \{\lambda R(x) + Q(x, \lambda)\}\, Y(x, \lambda), \tag{2.1} $$

where³ $R(x) = (\delta_{ij} r_j(x))$ and $Q(x, \lambda) = (q_{ij}(x, \lambda))$, will be assumed to satisfy the general conditions given herewith. In the complex plane the variables $x$ and $\lambda$ are to be permitted to range over suitable regions, the existence of which is presupposed as a basic hypothesis. The precise meaning of the term suitable, as used here, is contained in the following definition.

Received January 3, 1941.

¹ R. E. Langer, The boundary problem of an ordinary linear differential system in the complex domain, Trans. of the Am. Math. Soc., vol. 46 (1939), pp. 151-162.

The author wishes to express his thanks to Professor Langer for most valuable suggestions received during the preparation of the present paper.

² In the notation adopted here, italic capitals without subscripts denote square matrices of order $n$. The operations of differentiation and integration are applied in accordance with the relations

$$ \frac{d}{dx}\,(y_{ij}(x)) = \left(\frac{d}{dx}\, y_{ij}(x)\right), \qquad \int^x (y_{ij}(x))\,dx = \left(\int^x y_{ij}(x)\,dx\right), $$

in which the right members serve to define the left. Also, a matrix is said to be analytic if each element is analytic.


A closed region in the $x$ plane consisting of a sector of a circle together with its boundary and a region in the $\lambda$ plane are said to be suitable to the matric differential equation (2.1) if for $x$ and $\lambda$ in them:

(a) $|\lambda|$ is unbounded;

(b) $x_0$ denoting the vertex of the circular sector in the $x$ plane, the coefficient functions $r_j(x)$ $(j = 1, \dots, n)$ are analytic except possibly for poles at $x_0$;

(c) the differences $r_i(x) - r_j(x)$ $(i, j = 1, \dots, n;\ i \neq j)$ have zeros, if any, only at $x_0$; and

(d) the functions $q_{ij}(x, \lambda)$ $(i, j = 1, \dots, n)$ are analytic in $x$ except possibly for poles at $x_0$, are bounded in $\lambda$, and, for $|\lambda|$ large, admit either actual or asymptotic representations such that

$$ q_{ij}(x, \lambda) \sim \sum_{\nu=0}^{\infty} \lambda^{-\nu} q_{ij}^{(\nu)}(x) \qquad (i, j = 1, \dots, n), \tag{2.2} $$

where the $q_{ij}^{(\nu)}(x)$ depend only on $x$, and where

$$ q_{jj}^{(0)}(x) \equiv 0 \qquad (j = 1, \dots, n). \tag{2.3} $$

Also, it will be assumed hereinafter that if in the suitable region in question the $r_j(x)$ are bounded, this will imply the existence of at least one index pair $(i, j)$ $(i \neq j)$ for which $r_i(x) - r_j(x)$ has a zero at the point $x_0$ on the boundary. This will avoid a repetition here of the earlier results cited above in the introduction.


3. "Associated" and "fundamental" regions. The concepts of "associated" and "fundamental" regions are due to Langer, who introduced them in the paper referred to. The details of the discussion involved in their definition are repeated here.

In a region of the $x$ plane suitable to (2.1), the relations

$$ \frac{d}{dx} R_j(x) = r_j(x) \qquad (j = 1, \dots, n) \tag{3.1} $$

can be satisfied by a set of functions $R_j(x)$ which are analytic except possibly for poles and logarithmic infinities on the boundary. Suppose such a set of functions has been chosen. The transformations

$$ \xi^{ij} = \lambda\{R_i(x) - R_j(x)\} \qquad (i, j = 1, \dots, n;\ i \neq j) \tag{3.2} $$

may then be defined, $\lambda$ varying within a suitable $\lambda$ region; and for each $\lambda$ the relations (3.2) corresponding to a given index pair $(i, j)$ will map any closed subregion⁴ $X$ of the suitable $x$ region onto a closed region $\Xi^{ij}$ of the complex $\xi^{ij}$ plane.

³ The symbol $\delta_{ij}$ is the Kronecker delta.

Let $\lambda$ be fixed for the moment, and consider the possibility that such a subregion $X$ contains a set of points $x^{ij}_{\lambda}$ $(i, j = 1, \dots, n;\ i \neq j)$, not necessarily distinct, such that each of the points $\xi^{ij}_{\lambda}$ obtained by the relations

$$ \xi^{ij}_{\lambda} = \lambda\{R_i(x^{ij}_{\lambda}) - R_j(x^{ij}_{\lambda})\} \qquad (i, j = 1, \dots, n;\ i \neq j) \tag{3.3} $$

can be connected with every point of the respective region $\Xi^{ij}$ by a path which lies wholly in $\Xi^{ij}$ and along which, in passing from $\xi^{ij}_{\lambda}$ to the point $\xi^{ij}$ in question, the abscissa is non-increasing. The existence of such a set of points plainly depends upon the shape of $X$.

If $\lambda$ now varies, the shapes of the $\Xi^{ij}$ do not change, as is clear from (3.2). Variations in $|\lambda|$ alone merely alter the size of any $\Xi^{ij}$ and cannot, therefore, affect either the existence or the position of a point $x^{ij}_{\lambda}$. But a change in $\arg \lambda$ causes the regions $\Xi^{ij}$ to undergo rotations in their respective planes and an existing set of points $x^{ij}_{\lambda}$ may lose its characteristic properties thereby. This does not necessarily happen, however, and it will be shown later that such a set of points $x^{ij}_{\lambda}$ may exist independently of $\lambda$, retaining its properties for all variations in $\arg \lambda$ over some definite range. In reference to this possibility, the following terminology will be used.

A closed subregion $X$ of a suitable $x$ region and a subregion $\Lambda$ of a suitable $\lambda$ region will be termed "associated" regions if there exists in $X$ a set of points $x^{ij}_{\lambda}$ $(i, j = 1, \dots, n;\ i \neq j)$ not necessarily distinct, but fixed as to $\lambda$, having the properties described above, and retaining them for all $\lambda$ in the region $\Lambda$.⁵

It may happen that for a given suitable $\lambda$ region no associated region exists in the $x$ plane. However, the possibility that there are $x$ regions which may be associated with $\lambda$ regions completely covering⁶ the suitable $\lambda$ region is not excluded thereby, and such a set of $x$ regions may have a part in common. A definition follows.

A closed region of the $x$ plane will be designated as a fundamental region relative to a given suitable $\lambda$ region if it is included in each of a finite number of regions $X$ which are associated with regions $\Lambda$ completely covering the suitable $\lambda$ region in question.⁷

4. The existence of associated and fundamental regions. As in §2, denote by $x_0$ the vertex of a circular sector suitable to (2.1). Then the following theorem can be proved.

THEOREM 1.⁸ If every $r_i(x) - r_j(x)$ $(i \neq j)$ having at $x_0$ a pole of order 1 is of the form $a^{ij}(x - x_0)^{-1}$, where $a^{ij}$ $(\neq 0)$ is a constant, and if $\lambda_0$ is an arbitrarily chosen point within a suitable $\lambda$ region, there exist associated regions containing $x_0$ and $\lambda_0$, and there exist regions containing $x_0$ which are fundamental relative to the suitable $\lambda$ region.

⁴ "Closed subregion", i.e., a subregion together with its boundary.

⁵ Op. cit., p. 156.

⁶ It is admissible that the points $\lambda$ for which $\arg \lambda$ possesses one of finitely many specific values be regarded as "covered" by virtue of being a boundary point of a closed associated region $\Lambda$.

⁷ Op. cit., p. 156.

Let $\alpha$ be the smallest positive integer for which the functions

$$ (x - x_0)^{\alpha}\{r_i(x) - r_j(x)\}, \qquad (x - x_0)^{\alpha}\{r_i(x) - r_j(x)\}^{-1} \qquad (i, j = 1, \dots, n;\ i \neq j) \tag{4.1} $$

are all bounded as $x \to x_0$. Denote by $\Sigma$ any sector of which that part in which $|x - x_0|$ is sufficiently small lies in the given suitable $x$ region, and which is of the form

$$ \beta_1 \leq \arg (x - x_0) \leq \beta_2, \tag{4.2} $$

where

$$ 0 < \beta_2 - \beta_1 < \frac{\pi}{(\alpha + 1)(\nu + 1)} \qquad (\nu = n(n - 1)/2). \tag{4.3} $$

Finally, let $\beta_{\mu}^{ij}$ $(\mu = 1, 2;\ i, j = 1, \dots, n;\ i \neq j)$ be the limiting value of

$$ \arg \{r_i(x) - r_j(x)\} + \arg \lambda_0 $$

as $x \to x_0$ along $\arg (x - x_0) = \beta_{\mu}$.

As may be seen from (4.3) and the definition of $\alpha$, the following relations hold:

$$ 0 \leq |\beta_2^{ij} - \beta_1^{ij}| < \frac{\alpha\pi}{(\alpha + 1)(\nu + 1)} \qquad (i, j = 1, \dots, n;\ i \neq j). \tag{4.4} $$

If $\gamma_{\mu}^{ij}$ $(\mu = 1, 2;\ i, j = 1, \dots, n;\ i \neq j)$ are any real numbers which satisfy the conditions

$$ \gamma_1^{ij} < \beta_{\mu}^{ij} < \gamma_2^{ij} \qquad (\mu = 1, 2), \tag{4.5a} $$

$$ \gamma_2^{ij} - \gamma_1^{ij} < \frac{\alpha\pi}{(\alpha + 1)(\nu + 1)}, \tag{4.5b} $$

$$ \gamma_{\mu}^{ij} \equiv \gamma_{\mu}^{ji} \pmod{\pi} \qquad (\mu = 1, 2),^{9} \tag{4.5c} $$

then there exists a positive number $\delta$ such that in that part of $\Sigma$ in which $0 < |x - x_0| \leq \delta$, the following inequalities hold:

$$ \gamma_1^{ij} < \arg \{r_i(x) - r_j(x)\} + \arg \lambda_0 < \gamma_2^{ij} \qquad (i, j = 1, \dots, n;\ i \neq j). \tag{4.6} $$

Let $\theta$ be a number¹⁰ satisfying all of the inequalities

$$ (\beta_2 + \gamma_2^{ij}) - (\beta_1 + \gamma_1^{ij}) < \theta < \frac{\pi}{\nu + 1} \qquad (i, j = 1, \dots, n;\ i \neq j), \tag{4.7} $$

and let $\tau$ be a number satisfying the inequality

$$ \beta_1 + k\theta \leq \tau \leq \beta_2 + k\theta, \tag{4.8} $$

where $k$ is some integer from the set $(0, 1, \dots, \nu)$. Then, setting

$$ \tau^{ij} = \tau + \arg \{r_i(x) - r_j(x)\} + \arg \lambda_0, \tag{4.9} $$

the inequalities

$$ \beta_1 + \gamma_1^{ij} + k\theta < \tau^{ij} < \beta_2 + \gamma_2^{ij} + k\theta \qquad (i, j = 1, \dots, n;\ i \neq j) \tag{4.10} $$

are satisfied.

⁸ Op. cit., §4.

⁹ This is consistent with the two preceding conditions, since from the relations $\{r_i(x) - r_j(x)\} = -\{r_j(x) - r_i(x)\}$, it follows that $\beta_{\mu}^{ij} \equiv \beta_{\mu}^{ji} \pmod{\pi}$ $(\mu = 1, 2)$.

¹⁰ That such a number exists is clear from the relations (4.3) and (4.5b).
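The existence of a number $\theta$ satisfying (4.7), and the passage from (4.6) and (4.8) to (4.10), may be checked by a short computation. The following is only an illustrative sketch under the hypotheses (4.3), (4.5b), (4.6), and (4.8); the grouping of terms is an assumption of this sketch and is not taken from the text.

```latex
% Existence of \theta in (4.7): add the bounds (4.3) and (4.5b).
\begin{align*}
(\beta_2 + \gamma_2^{ij}) - (\beta_1 + \gamma_1^{ij})
  &= (\beta_2 - \beta_1) + (\gamma_2^{ij} - \gamma_1^{ij}) \\
  &< \frac{\pi}{(\alpha+1)(\nu+1)} + \frac{\alpha\pi}{(\alpha+1)(\nu+1)}
   = \frac{\pi}{\nu+1}.
\end{align*}
% There are finitely many index pairs, so the largest left member is
% still strictly less than \pi/(\nu+1), and a common \theta can be
% chosen strictly between the two.

% Passage to (4.10): add (4.8) to (4.6), using the definition (4.9).
\begin{align*}
\beta_1 + k\theta &\le \tau \le \beta_2 + k\theta, \\
\gamma_1^{ij} &< \arg\{r_i(x) - r_j(x)\} + \arg\lambda_0 < \gamma_2^{ij}, \\
\beta_1 + \gamma_1^{ij} + k\theta &< \tau^{ij} < \beta_2 + \gamma_2^{ij} + k\theta .
\end{align*}
```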

For at least one of the admitted values of $k$, for all index pairs $(i, j)$ $(i \neq j)$, $\tau^{ij}$, satisfying (4.10), is bounded (mod $\pi$) from $\tfrac{1}{2}\pi$. For, if this were not so, then, since there are $\nu + 1$ values for $k$ and, for each $k$, only $\nu$ essentially distinct inequalities (4.10) to consider,¹¹ for at least one index pair $(i, j)$, there would be two distinct values $k_1$ and $k_2$ for which $\tau^{ij}$ satisfying the corresponding inequality (4.10) would not be bounded from $\tfrac{1}{2}\pi$ (mod $\pi$). But, from the left of (4.7) it would follow that

$$ (k_2 + 1)\theta - k_1\theta \geq \pi, $$

where $k_1$ is taken to be the smaller of $k_1$ and $k_2$. Since $k_2 + 1 - k_1 \leq \nu + 1$, this yields

$$ (\nu + 1)\theta \geq \pi, $$

which contradicts the right hand inequality of (4.7).
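The counting step behind this contradiction can be spelled out as follows. This is a hedged sketch; the interval names $J_k$ and the points $\omega_1, \omega_2$ are labels introduced here for illustration and do not appear in the original argument.

```latex
% For a fixed index pair (i,j), write the range allowed by (4.10) as
\[
J_k = \bigl(\beta_1 + \gamma_1^{ij} + k\theta,\;
            \beta_2 + \gamma_2^{ij} + k\theta\bigr),
\qquad k = 0, 1, \dots, \nu .
\]
% By the left inequality of (4.7) each J_k has length less than \theta,
% so the upper end of J_k lies strictly below the lower end of J_{k+1}:
% the closed intervals are pairwise disjoint.
%
% Suppose the closures of J_{k_1} and J_{k_2} (k_1 < k_2) each contain a
% point congruent to \pi/2 (mod \pi), say \omega_1 and \omega_2.
% The points are distinct, hence \omega_2 - \omega_1 \ge \pi, while both
% lie between the lower end of J_{k_1} and the upper end of J_{k_2}:
\begin{align*}
\pi \le \omega_2 - \omega_1
  &\le (\beta_2 + \gamma_2^{ij} + k_2\theta)
     - (\beta_1 + \gamma_1^{ij} + k_1\theta) \\
  &< (k_2 + 1 - k_1)\,\theta \le (\nu + 1)\,\theta ,
\end{align*}
% which is incompatible with \theta < \pi/(\nu+1) from (4.7).
```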

Now in (4.8) take $k$ as that value (or one of the values) for which $\tau^{ij}$, for all pairs $(i, j)$ $(i \neq j)$, is bounded from $\tfrac{1}{2}\pi$ (mod $\pi$) and observe that if $\tau$ represents the inclination of a curve in $\Sigma$ (with $0 < |x - x_0| \leq \delta$), then, for each index pair $(i, j)$ $(i \neq j)$, $\tau^{ij}$ is the inclination of the corresponding curve in the respective $\xi^{ij}$ plane.

If $k = 0$, consider a parallelogram $X$ in $\Sigma$, with $x_0$ as one vertex, with two sides along the inclinations $\beta_1$ and $\beta_2$, and such that $|x - x_0| \leq \delta$ in $X$. The boundary of such a region has an inclination $\tau$ satisfying the inequality

$$ \beta_1 \leq \tau \leq \beta_2; $$

hence the inclinations of the boundaries of the regions $\Xi^{ij}$ onto which $X$ is mapped by the relations (3.2) (with $\lambda = \lambda_0$) are bounded (mod $\pi$) from $\tfrac{1}{2}\pi$. In addition, the boundary of every such $\Xi^{ij}$ is cut in at most two points by each vertical line $\Re\{\xi^{ij}\} = \text{constant}$ in the respective $\xi^{ij}$ plane, as is evident from the shape of $X$ and the character of the map which is conformal except at the point $x_0$, the possibility of a reentrant angle at $\xi^{ij}(x_0)$ being excluded by (4.3). Hence, there exists in $X$ a set of points $x^{ij}_{\lambda}$ which are independent of $\lambda$ in some region $\Lambda$ containing $\lambda_0$ and included in the given suitable $\lambda$ region, i.e., $X$ and $\Lambda$ form a pair of associated regions.

¹¹ Cf. (4.5c). The inequality (4.10) corresponding to the index pair $(i, j)$ is the same (mod $\pi$) as that corresponding to the pair $(j, i)$.

If $k \neq 0$, there are two mutually exclusive cases. Either (1) all $\beta_{\mu} + \beta_{\mu}^{ij}$ $(\mu = 1, 2;\ i, j = 1, \dots, n;\ i \neq j)$ are not $\tfrac{1}{2}\pi$ (mod $\pi$) or (2) some $\beta_{\mu} + \beta_{\mu}^{ij}$ are $\tfrac{1}{2}\pi$ (mod $\pi$).

In the first case, there is a positive number $\delta_1 \leq \delta$ such that the curves in the various $\xi^{ij}$ planes which correspond under (3.2) (with $\lambda = \lambda_0$) to the parts of the rays bounding $\Sigma$ on which $|x - x_0| \leq \delta_1$ have inclinations bounded (mod $\pi$) from $\tfrac{1}{2}\pi$. If the rays bounding $\Sigma$ be cut at positive distances less than $\delta_1$ from $x_0$ by a straight line of inclination $\tau$ satisfying (4.8), the triangle $X$, with vertex at $x_0$, so formed in $\Sigma$, can be associated (as in the case of the parallelogram above) with a region $\Lambda$ containing $\lambda_0$.

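In the second case, let it be recalled that each $r_i(x) - r_j(x)$ $(i \neq j)$ is of the form

$$ (x - x_0)^{\alpha^{ij}}\{a_0^{ij} + a_1^{ij}(x - x_0) + \cdots\} \qquad (a_0^{ij} \neq 0), $$

where the $\alpha^{ij}$ are integers; whence

$$ \beta_{\mu} + \beta_{\mu}^{ij} = (\alpha^{ij} + 1)\beta_{\mu} + \arg \lambda_0 + \lim_{\substack{x \to x_0 \\ \arg (x - x_0) = \beta_{\mu}}} \arg \{a_0^{ij} + a_1^{ij}(x - x_0) + \cdots\} \qquad (\mu = 1, 2). $$

First consider the case in which, for all $(i, j)$ $(i \neq j)$, $\alpha^{ij} \neq -1$. Altering $\Sigma$ to $\Sigma'$ by replacing $\beta_{\mu}$ by $\beta_{\mu} + (-1)^{\mu - 1}\Delta$ $(\mu = 1, 2)$ changes $\beta_{\mu} + \beta_{\mu}^{ij}$ by the amount $(-1)^{\mu - 1}(\alpha^{ij} + 1)\Delta$. Thus, by choosing $\Delta$ positive and sufficiently small, the quantities $\beta_{\mu} + \beta_{\mu}^{ij}$ corresponding to $\Sigma'$ will differ from $\tfrac{1}{2}\pi$ (mod $\pi$) for all $(i, j)$ $(i \neq j)$. Hence, a portion of $\Sigma'$ in which $|x - x_0|$ is sufficiently small can be associated with a $\lambda$ region containing $\lambda_0$.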
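The effect of the small rotation $\Delta$ can be made explicit. The sketch below assumes the sign convention $(-1)^{\mu-1}$ (the sector is narrowed from both sides) and writes $c^{ij} = \arg a_0^{ij}$ for the limit of the bracketed argument; both are conventions adopted here for illustration.

```latex
% Along arg(x - x_0) = \beta_\mu the limiting argument is
\[
\beta_\mu + \beta_\mu^{ij}
  = (\alpha^{ij} + 1)\,\beta_\mu + \arg\lambda_0 + c^{ij},
\qquad c^{ij} = \arg a_0^{ij}.
\]
% Replacing \beta_\mu by \beta_\mu + (-1)^{\mu-1}\Delta gives
\[
(\alpha^{ij} + 1)\bigl(\beta_\mu + (-1)^{\mu-1}\Delta\bigr)
  + \arg\lambda_0 + c^{ij}
  = \bigl(\beta_\mu + \beta_\mu^{ij}\bigr)
  + (-1)^{\mu-1}(\alpha^{ij} + 1)\,\Delta .
\]
% When \alpha^{ij} \ne -1 the shift is a nonzero multiple of \Delta.
% Each of the finitely many quantities (\mu = 1,2; i \ne j) is congruent
% to \pi/2 (mod \pi) for only isolated values of \Delta, so every
% sufficiently small positive \Delta avoids all of them at once.
```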

Next, suppose that all $\alpha^{ij} = -1$, $a_{\nu}^{ij} = 0$ $(\nu \geq 1)$.¹² In this case, the rays bounding $\Sigma$ map into straight lines in the various $\xi^{ij}$ planes, from which it is readily deduced that that part of $\Sigma$ cut off by a straight line passing sufficiently close to $x_0$ and of inclination $\tau$ satisfying (4.8) can be associated with a closed region $\Lambda$ containing $\lambda_0$ as a boundary point.

Finally, if the restrictions of the preceding two paragraphs do not hold so that some $\alpha^{ij}$ are and some are not $-1$, it is now clear that there is a sector $\Sigma'$ differing by an arbitrarily small amount from $\Sigma$ such that a portion of $\Sigma'$ in which $|x - x_0|$ is sufficiently small can be associated with a region $\Lambda$ containing $\lambda_0$.

It remains to establish the existence of a fundamental region containing $x_0$. To this end, observe that if $\Lambda$ and $X$ are associated regions of the $\lambda$ and $x$ planes respectively, then any subregion of the suitable $\lambda$ region having the same range of variation of $\arg \lambda$ as does $\Lambda$ can also be associated with $X$. Also, the immediately preceding discussion has shown that every point of the suitable $\lambda$ region is contained in a region associated with a region of the $x$ plane which may be assumed to contain that part of the sector

$$ \beta_1 + 2\Delta \leq \arg (x - x_0) \leq \beta_2 - 2\Delta $$

in which $|x - x_0|$ is sufficiently small. Hereby the total closed range of variation of $\arg \lambda$ in the given suitable $\lambda$ region is completely covered by subintervals corresponding to the associated subregions. But then a finite number of the $\Lambda$ subregions possess intervals of variation of $\arg \lambda$ which together completely cover the total range in the given suitable region. The corresponding finitely many associated regions of the $x$ plane can be associated with regions entirely covering the suitable $\lambda$ region, and these regions have in common a region containing $x_0$.

¹² Recall the assumption made in the statement of Theorem 1.

The usefulness of associated regions, as will be seen, depends upon the definitive properties of the points $x^{ij}_{\lambda}$, together with the possibility of choosing the regions as described in the corollary immediately below. In a region $X$ associated with a region of the $\lambda$ plane, each point of an existing set $x^{ij}_{\lambda}$ can be joined to any point $x$ of $X$ by a path on the image of which in the respective $\xi^{ij}$ plane in passing from $\xi^{ij}_{\lambda}$ to $\xi^{ij}(x)$ the abscissa is monotone decreasing.

COROLLARY TO THEOREM 1. Under the conditions of Theorem 1, there exist associated regions with the property that the paths in the $x$ plane referred to in the preceding paragraph may be chosen to have the further property of being bounded in length collectively, for all index pairs $(i, j)$, by some positive constant $L_0$ which is independent of $\lambda$.

In fact the associated regions exhibited in the proof of Theorem 1 possess such a property. Observe that the regions $X$, constructed in the proof referred to, map, for $\lambda = \lambda_0$, into regions $\Xi^{ij}$ in each of which every point $\xi^{ij}(x)$ can be connected with $\xi^{ij}_{\lambda}$ by a path consisting entirely of rectilinear segments parallel to the axis of reals together with appropriately chosen portions of the boundary of the region $\Xi^{ij}$ in question. Let the images in $X$ of these paths be taken not only for $\lambda = \lambda_0$ but also for all values of $\lambda$ in the associated region $\Lambda$. It is quickly verified that such a choice is appropriate. First, the path so chosen to connect a specific $x^{ij}_{\lambda}$ with a specific point $x$ maps into a curve in the region $\Xi^{ij}$ connecting $\xi^{ij}_{\lambda}$ with $\xi^{ij}(x)$ and along which in passing from $\xi^{ij}_{\lambda}$ to $\xi^{ij}(x)$ the abscissa is monotone decreasing. Secondly, the paths so chosen in $X$ are independent of $\lambda$. Thirdly, these paths are bounded in length. The last statement follows from the fact that the boundary of $X$ is of finite length, from the fact that the mapping of each $\Xi^{ij}$ $(\lambda = \lambda_0)$ onto $X$ is analytic in the interior and on the boundary, it being a simple matter to show that such a map carries all rectilinear paths into curves of bounded length, and from the fact that there are finitely many regions $\Xi^{ij}$.


5. The solutions of the differential equation. Let $\eta$ be the smallest integer for which the functions

$$ (x - x_0)^{\eta}\{r_i(x) - r_j(x)\}^{-1} \qquad (i, j = 1, \dots, n;\ i \neq j) \tag{5.1} $$

are all bounded.

THEOREM 2. Let $X$ and $\Lambda$ be a pair of associated regions having the properties described in the corollary to Theorem 1. Then a set of conditions sufficient that for $x$ in $X$, and for all $\lambda$ of sufficiently large absolute value¹³ in $\Lambda$, there exists a solution for (2.1) of the form

$$ Y(x, \lambda) = \Bigl\{I + \sum_{\nu=1}^{k-1} \lambda^{-\nu} P^{(\nu)}(x) + \lambda^{-k} B_k(x, \lambda)\Bigr\} E(x, \lambda), \tag{5.2} $$

where $k$ is a positive integer, the matrices $P^{(\nu)}(x)$ and $B_k(x, \lambda)$ are analytic and bounded in $x$, $B_k(x, \lambda)$ is bounded in $\lambda$, and $E(x, \lambda) = (\delta_{ij} \exp \{\lambda R_j(x)\})$, is the following:

(a) the functions $r_j(x)$ $(j = 1, \dots, n)$ have poles, if any, only at $x_0$; and the differences $r_i(x) - r_j(x)$ $(i \neq j)$ have zeros, if any, only at $x_0$;

(b) the matrix $Q(x, \lambda)$ is analytic in $x$;

(c) the matrices $Q^{(\nu)}(x)$ $(\nu = 0, 1, \dots)$ are all analytic; and if $\eta > 0$, the matrices

$$ (x - x_0)^{-(k - \nu)(\eta + 1) + 1}\, Q^{(\nu)}(x) \qquad (\nu = 0, 1, \dots, k - 1) $$

are all bounded. 


The formulas¹⁴

$$ \begin{aligned} p_{ij}^{(0)} &= \delta_{ij} && (i, j = 1, \dots, n), \\ p_{ij}^{(\nu)}(x) &= \{r_i(x) - r_j(x)\}^{-1} \Bigl\{ p_{ij}^{(\nu-1)\prime}(x) - \sum_{h=0}^{\nu-1} \sum_{l=1}^{n} q_{il}^{(h)}(x)\, p_{lj}^{(\nu-1-h)}(x) \Bigr\} && (i \neq j), \\ p_{ii}^{(\nu)\prime}(x) &= \sum_{h=0}^{\nu} \sum_{l=1}^{n} q_{il}^{(h)}(x)\, p_{li}^{(\nu-h)}(x) && (\nu > 0), \end{aligned} \tag{5.3} $$

with any choice of constants of integration, define in succession, for $\nu = 0, 1, \dots, k$, a set of matrices $P^{(\nu)}(x)$ which satisfy the equations

$$ P^{(\nu+1)}R - RP^{(\nu+1)} + P^{(\nu)\prime} - \sum_{h=0}^{\nu} Q^{(h)} P^{(\nu-h)} = 0 \qquad (\nu = 0, \dots, k - 1) \tag{5.4} $$

and the matrices of at least one such set will be analytic in $X$. This is apparent when $\eta \leq 0$. If $\eta > 0$, let the $P^{(\nu)}(x)$ be constructed so that the elements of the matrices have zeros of the highest possible order at $x_0$. From hypothesis (c) of the theorem, it is readily verified that the elements of a $P^{(\nu)}(x)$ so constructed, where $\nu$ is one of the integers $1, 2, \dots, k$, are analytic, having zeros at $x_0$ of order at least $(k - \nu)(\eta + 1)$.
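The origin of the recursion (5.4) may be seen by a formal substitution. The following is a sketch only: it treats the expansions in $\lambda^{-1}$ as formal power series, which is an assumption of the illustration and not a claim about convergence.

```latex
% Put Y = P E with P = \sum_{\nu \ge 0} \lambda^{-\nu} P^{(\nu)}(x),
% P^{(0)} = I, and E' = \lambda R E (since R_j' = r_j by (3.1)).
\begin{align*}
Y' &= (P' + \lambda P R)\,E, \\
(\lambda R + Q)\,Y &= (\lambda R P + Q P)\,E .
\end{align*}
% Equating and cancelling E gives
\[
P' + \lambda\,(P R - R P) - Q P = 0 .
\]
% With Q \sim \sum_h \lambda^{-h} Q^{(h)}, the coefficient of
% \lambda^{-\nu} (\nu \ge 0) is
\[
P^{(\nu)\prime} + P^{(\nu+1)} R - R P^{(\nu+1)}
  - \sum_{h=0}^{\nu} Q^{(h)} P^{(\nu-h)} = 0 ,
\]
% which is (5.4). Its (i,j) element, i \ne j, reads
\[
\bigl(r_j - r_i\bigr)\,p_{ij}^{(\nu+1)} + p_{ij}^{(\nu)\prime}
  - \sum_{h=0}^{\nu}\sum_{l=1}^{n} q_{il}^{(h)}\,p_{lj}^{(\nu-h)} = 0 ,
\]
% and the diagonal element, where R drops out, determines
% p_{ii}^{(\nu)} by a quadrature.
```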
With a set of analytic matrices $P^{(\nu)}(x)$ $(\nu = 0, \dots, k)$, form the matrix

$$ P_k(x, \lambda) = \sum_{\nu=0}^{k} \lambda^{-\nu} P^{(\nu)}(x). \tag{5.5} $$

¹³ Henceforth the notation "$|\lambda| > N$" will be used to replace the phrase "$\lambda$ of sufficiently large absolute value".

¹⁴ Op. cit., §5. These formulas and their use in the construction of equation (5.7) are adapted from Langer's treatment.


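Then define $S(x, \lambda, k)$ by the relation

$$ S(x, \lambda, k) = P_k(x, \lambda)E(x, \lambda). \tag{5.6} $$

It will be found that $S$ satisfies the equation

$$ \frac{d}{dx} S = \{\lambda R + Q + \lambda^{-k} A\} S \tag{5.7} $$

in which, $P_k$ being non-singular for $|\lambda|$ large,

$$ A \sim \Bigl\{ P^{(k)\prime}(x) - \sum_{h=k}^{\infty} \lambda^{-(h-k)} \sum_{\nu=0}^{k} Q^{(h-\nu)}(x) P^{(\nu)}(x) \Bigr\} P_k^{-1}(x, \lambda). \tag{5.8} $$

It is evident from (5.7) and (5.8) that the elements $a_{ij}(x, \lambda)$ of the matrix $A(x, \lambda)$ are analytic in $x$ and are bounded in $x$ and $\lambda$. Thus, in an obvious sense, (5.7) approximates (2.1) for $|\lambda|$ large.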
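One way to see how a remainder matrix $A$ of this kind arises is to collect what is left after the relations (5.4) have been used. The computation below is a formal sketch; the index $m$ and the arrangement of the double sum are choices made here for illustration.

```latex
% From S = P_k E and E' = \lambda R E,
\[
S' - (\lambda R + Q)\,S
  = \bigl\{ P_k' + \lambda\,(P_k R - R P_k) - Q P_k \bigr\}\,E .
\]
% Expand in powers of \lambda^{-1}. The \lambda^{1} term vanishes because
% P^{(0)} = I commutes with R, and for 0 \le \nu \le k-1 the coefficient
% of \lambda^{-\nu} vanishes by (5.4). What remains is
\[
\lambda^{-k} P^{(k)\prime}
  - \sum_{m=k}^{\infty} \lambda^{-m}
    \sum_{\nu=0}^{k} Q^{(m-\nu)} P^{(\nu)} ,
\]
% so that, writing the remainder as \lambda^{-k} A P_k,
\[
A \sim \Bigl\{ P^{(k)\prime}
  - \sum_{m=k}^{\infty} \lambda^{-(m-k)}
    \sum_{\nu=0}^{k} Q^{(m-\nu)} P^{(\nu)} \Bigr\} P_k^{-1} .
\]
% Special case Q^{(h)} = 0 for h \ge 1: only m = k, \nu = k survives,
% giving A \sim \{P^{(k)\prime} - Q^{(0)} P^{(k)}\} P_k^{-1}.
```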

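Now it is readily shown that if $Y(x, \lambda)$ is a solution of

$$ \frac{d}{dx} \{S^{-1}Y\} = -\lambda^{-k} S^{-1} A Y, \tag{5.9} $$

then it is also a solution of (2.1). Upon integration, this becomes

$$ Y(x, \lambda) = S(x) - \lambda^{-k} \int S(x) S^{-1}(\zeta) A(\zeta) Y(\zeta)\, d\zeta. \tag{5.10} $$

Setting

$$ S(x) S^{-1}(\zeta) A(\zeta) = K(x, \zeta) \tag{5.11} $$

in (5.10) and then iterating, one is led to the series

$$ S(x) - \lambda^{-k} S^{(1)}(x) + \lambda^{-2k} S^{(2)}(x) - \cdots + (-1)^{\nu} \lambda^{-\nu k} S^{(\nu)}(x) + \cdots, \tag{5.12} $$

where

$$ S^{(\nu)}(x) = \int K(x, \zeta) S^{(\nu-1)}(\zeta)\, d\zeta \qquad (\nu = 1, 2, \dots) \tag{5.13} $$

with $S^{(0)}(x) = S(x)$.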
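The assertion that a solution of (5.9) also solves (2.1) rests on a one-line matrix identity, and the series (5.12) is the usual result of successive substitution. Both steps are sketched below under the assumption that $S$ is non-singular; the iterates $Y_m$ are a notation introduced here.

```latex
% Differentiate S S^{-1} = I to get (S^{-1})' = -S^{-1} S' S^{-1}.
% If (5.9) holds, then with (5.7),
\begin{align*}
Y' &= \bigl(S\,S^{-1}Y\bigr)'
    = S' S^{-1} Y + S\,\bigl(S^{-1}Y\bigr)' \\
   &= (\lambda R + Q + \lambda^{-k}A)\,Y - \lambda^{-k} A\,Y
    = (\lambda R + Q)\,Y ,
\end{align*}
% which is (2.1).

% Successive substitution in (5.10), with K as in (5.11):
\[
Y_0 = S, \qquad
Y_{m+1}(x) = S(x) - \lambda^{-k}\!\int K(x,\zeta)\,Y_m(\zeta)\,d\zeta ,
\]
% gives Y_m = \sum_{\nu=0}^{m} (-1)^{\nu}\lambda^{-\nu k} S^{(\nu)},
% the partial sums of (5.12) with S^{(\nu)} defined by (5.13).
```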
Now put $B^{(0)}(x) = I$ and define matrices $B^{(\nu)}(x)$ for $\nu \geq 1$ by the equations

$$ B^{(\nu)}(x) = \int K(x, \zeta) B^{(\nu-1)}(\zeta) S(\zeta) S^{-1}(x)\, d\zeta. \tag{5.14} $$

Then, for all $\nu \geq 0$, $S^{(\nu)}(x) = B^{(\nu)}(x) S(x)$, and (5.12) takes the form

$$ \{I - \lambda^{-k} B^{(1)}(x) + \lambda^{-2k} B^{(2)}(x) - \cdots\} S(x). \tag{5.15} $$

Making use of the matrices $I_h = (\delta_{ih}\delta_{hj})$ and recalling (5.11), (5.14) can be written in the form

$$ B^{(\nu)}(x) = \sum_{h,l=1}^{n} \int_{z^{hl}}^{x} S(x) I_h S^{-1}(\zeta) A(\zeta) B^{(\nu-1)}(\zeta) S(\zeta) I_l S^{-1}(x)\, d\zeta, $$

where, for the moment, the points $z^{hl}$ are formally undetermined. If we note that

$$ S(x) I_h S^{-1}(\zeta) = P_k(x) I_h P_k^{-1}(\zeta)\, e^{\lambda\{R_h(x) - R_h(\zeta)\}}, $$

this becomes¹⁵

$$ B^{(\nu)}(x) = \sum_{h,l=1}^{n} \int_{z^{hl}}^{x} P_k(x) I_h P_k^{-1}(\zeta) A(\zeta) B^{(\nu-1)}(\zeta) P_k(\zeta) I_l P_k^{-1}(x)\, e^{\xi^{hl}(x) - \xi^{hl}(\zeta)}\, d\zeta. \tag{5.16} $$

Now set¹⁶ $z^{hl} = x^{hl}_{\lambda}$ and choose the paths of integration to be bounded in length uniformly in $\lambda$ by a positive constant $L_0$ and such that on those corresponding to the index pair $(h, l)$, $\Re\{\xi^{hl}(x) - \xi^{hl}(\zeta)\} \leq 0$. The possibility of satisfying the latter requirement is an immediate consequence of the definitive properties of the points $x^{hl}_{\lambda}$ $(h \neq l)$. Then along the respective paths of integration the following relations hold:

$$ \bigl| e^{\xi^{hl}(x) - \xi^{hl}(\zeta)} \bigr| \leq 1 \qquad (h, l = 1, \dots, n). $$


Thus (the matrices $A$, $P_k$, $P_k^{-1}$ being bounded in $x$, and in $\lambda$ for $|\lambda|$ large), if there exists a sequence of positive numbers $b^{(\nu)}$ $(\nu = 0, 1, \dots)$ such that for all $\nu$:

$$ |b_{ij}^{(\nu)}| \leq b^{(\nu)} \qquad (i, j = 1, \dots, n), $$

it follows directly that, for some positive $m$, and for $|\lambda|$ large,

$$ |b_{ij}^{(\nu)}| \leq m\, b^{(\nu-1)} \qquad (i, j = 1, \dots, n;\ \nu \geq 1). $$

But $b^{(0)}$ can be taken as 1, whence by induction:

$$ |b_{ij}^{(\nu)}| \leq m^{\nu} = b^{(\nu)} \qquad (i, j = 1, \dots, n;\ \nu \geq 0). \tag{5.17} $$

Hence the series $\sum (-1)^{\nu} \lambda^{-\nu k} B^{(\nu)}$ is dominated by the series $\sum |\lambda|^{-\nu k} m^{\nu}$ and, therefore, converges absolutely and uniformly in $x$ and $\lambda$ for $x$ in $X$ and $|\lambda| > N$. Thus a function $Y(x, \lambda)$ is defined in $X$ by the relation

$$ Y(x, \lambda) = \sum_{\nu=0}^{\infty} (-1)^{\nu} \lambda^{-\nu k} B^{(\nu)}(x) S(x) = U(x, \lambda) S(x, \lambda). \tag{5.18} $$
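The dominating series is geometric, which makes the size of the admissible $|\lambda|$ easy to estimate. The sketch below assumes only the bound (5.17); the particular threshold $|\lambda|^k \geq 2m$ is an illustrative choice and not a value given in the text, where $N$ must also be large enough for the earlier boundedness statements.

```latex
% Elementwise, by (5.17),
\[
\Bigl| \sum_{\nu=0}^{\infty} (-1)^{\nu}\lambda^{-\nu k}\, b_{ij}^{(\nu)}(x) \Bigr|
  \le \sum_{\nu=0}^{\infty} \bigl(m\,|\lambda|^{-k}\bigr)^{\nu}
  = \frac{1}{1 - m\,|\lambda|^{-k}}
  \qquad \text{for } |\lambda|^{k} > m .
\]
% For instance, if |\lambda|^k \ge 2m the ratio is at most 1/2 and every
% element of U(x,\lambda) is bounded by 2, uniformly in x.
% The same estimate applied to the tail \nu \ge 1 gives
\[
\bigl| u_{ij}(x,\lambda) - \delta_{ij} \bigr|
  \le \frac{m\,|\lambda|^{-k}}{1 - m\,|\lambda|^{-k}} ,
\]
% which tends to 0 as |\lambda| \to \infty.
```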
The function $Y(x, \lambda)$ so defined is a solution of (2.1). To show this, multiply (5.18) on the left by $S^{-1}(x)$, and differentiate:

$$ \{S^{-1}Y\}' = (S^{-1})'US + S^{-1}U'S + S^{-1}US'. $$

Since the series $U(x, \lambda)$ is uniformly convergent in $x$, it may be differentiated term by term. Simplifying, the above reduces to (5.9), whence $Y(x, \lambda)$ is a solution of (2.1).

Finally, factoring out $E(x, \lambda)$, (5.18) takes the form (5.2).

¹⁵ The notation $\xi^{hl}$ is undefined for $h = l$. Let it be given the interpretation: $\xi^{hh} \equiv 0$ $(h = 1, \dots, n)$.

¹⁶ The points $x^{hl}_{\lambda}$ have been defined only for $h \neq l$. Let them be chosen arbitrarily when $h = l$.











COROLLARY. Under the conditions of the theorem, there exist fundamental regions containing $x_0$ wherein, for $|\lambda| > N$ in a suitable $\lambda$ region, a solution $Y(x, \lambda)$ of (2.1) exists such that

$$ \lambda^{k-1} \Bigl\{ Y(x, \lambda) E^{-1}(x, \lambda) - \sum_{\nu=0}^{k-1} \lambda^{-\nu} P^{(\nu)}(x) \Bigr\} \to 0 $$

uniformly in $x$ as $\lambda \to \infty$, where the $P^{(\nu)}(x)$ $(\nu = 0, \dots, k - 1)$ are analytic and bounded.

The corollary is an immediate consequence of the fact that from Theorem 1 and its corollary there exists a fundamental region containing $x_0$ which is the intersection of a finite set of $x$ regions associated with regions covering all of the given suitable $\lambda$ region and to each of which Theorem 2 applies.


6. The case of a meromorphic $Q(x, \lambda)$. The results of Theorem 2 and its corollary may be generalized to cases in which the elements of the matrix $Q(x, \lambda)$ have poles. Let $X$ and $\Lambda$ be a pair of associated regions possessing the properties described in the corollary to Theorem 1, and suppose that for $x$ in $X$ the following conditions hold:

(a) the functions $r_i(x) - r_j(x)$ $(i, j = 1, \dots, n;\ i \neq j)$ are bounded from zero, having poles at $x_0$ on the boundary of $X$;

(b) $\eta \leq -2$ (cf. the definition of $\eta$ at the beginning of §5);

(c) the functions $q_{ij}(x, \lambda)$ $(i, j = 1, \dots, n)$ have poles, if any, only at $x_0$.

THEOREM 3. A solution for (2.1) of the form (5.2) with $k = 1$ exists in $X$ for $|\lambda| > N$ in $\Lambda$ if, in addition to conditions (a), (b), and (c) above, the following conditions also hold:

(d) the matrices $Q^{(\nu)}(x)$ $(\nu \geq 1)$ are analytic;

(e) the matrix

$$ (x - x_0)^{-(\eta - 1)/3}\, Q^{(0)}(x) $$

is bounded and
(f) the functions

$$ (x - x_0)^{(\eta + 2)/3}\, q_{jj}^{(1)}(x) \qquad (j = 1, \dots, n) $$

are all bounded.

Given a pair of suitable regions in the $x$ and $\lambda$ planes, there exists, under the specified conditions, a fundamental region containing $x_0$ in which, for $|\lambda| > N$ in the suitable $\lambda$ region, there is a solution $Y(x, \lambda)$ of (2.1) such that

$$ Y(x, \lambda) E^{-1}(x, \lambda) \to I $$

uniformly in $x$ as $\lambda \to \infty$.

The conditions of the theorem admit of the determination, from formulas (5.3), of matrices $P^{(0)}(x)$ and $P^{(1)}(x)$ which satisfy the relations (5.4) for $\nu = 0, 1$ and such that the elements of $P^{(1)}(x)$ all have zeros at $x_0$ of order at least $-(\eta - 1)/3$. Using matrices so determined, form the matrix

$$ P_1(x, \lambda) = \sum_{\nu=0}^{1} \lambda^{-\nu} P^{(\nu)}(x). $$

Then $S(x, \lambda, 1)$, defined by the relation

$$ S(x, \lambda, 1) = P_1(x, \lambda) E(x, \lambda), $$

satisfies the equation

$$ \frac{d}{dx} S = \{\lambda R + Q + \lambda^{-1} A\} S \tag{6.1} $$

in which

$$ A \sim \Bigl\{ P^{(1)\prime}(x) - \sum_{h=1}^{\infty} \lambda^{-(h-1)} \sum_{\nu=0}^{1} Q^{(h-\nu)}(x) P^{(\nu)}(x) \Bigr\} P_1^{-1}(x, \lambda). \tag{6.2} $$

Since the elements of $P^{(1)}(x)$ have zeros at $x_0$ of order at least $-(\eta - 1)/3$, it follows from (6.1) and (6.2) that the elements $a_{ij}(x, \lambda)$ of $A(x, \lambda)$ are analytic in $x$ and are bounded in $x$ and $\lambda$.

It is now apparent that the proof of the theorem can be made precisely similar to that used to establish Theorem 2 and its corollary.

THEOREM 4. A solution for (2.1) of the form (5.2) with $k = 2$ exists in $X$ for $|\lambda| > N$ in $\Lambda$ if, in addition to conditions (a), (b), and (c), the following conditions also hold:

(d) the matrices $Q^{(\nu)}(x)$ $(\nu \geq 2)$ are analytic;

(e) the matrices

$$ (x - x_0)^{-(\eta - 2)/4}\, Q^{(\nu)}(x) \qquad (\nu = 0, 1) $$

are bounded and

(f) the functions

$$ (x - x_0)^{(\eta + 2)/4}\, q_{jj}^{(\nu)}(x) \qquad (\nu = 1, 2;\ j = 1, \dots, n) $$

are all bounded.

Given a pair of suitable regions in the $x$ and $\lambda$ planes, there is a fundamental region containing $x_0$ in which, for $|\lambda| > N$ in the suitable $\lambda$ region, there exists a solution $Y(x, \lambda)$ of (2.1) such that

$$ \lambda \{ Y(x, \lambda) E^{-1}(x, \lambda) - I - \lambda^{-1} P^{(1)}(x) \} \to 0 $$

uniformly in $x$ as $\lambda \to \infty$, where $P^{(1)}(x)$ is analytic.

It is readily verified that matrices $P^{(0)}(x)$, $P^{(1)}(x)$, and $P^{(2)}(x)$ can be constructed from (5.3) to satisfy the equations (5.4) for $\nu = 0, 1, 2$ and such that the elements of $P^{(1)}(x)$ and $P^{(2)}(x)$ have zeros at $x_0$ of order at least $-(\eta - 2)/4$. With matrices so constructed, form

$$ P_2(x, \lambda) = \sum_{\nu=0}^{2} \lambda^{-\nu} P^{(\nu)}(x). $$

Then $S(x, \lambda, 2)$, defined by the relation

$$ S(x, \lambda, 2) = P_2(x, \lambda) E(x, \lambda), $$

satisfies

$$ \frac{d}{dx} S = \{\lambda R + Q + \lambda^{-2} A\} S $$

in which

$$ A \sim \Bigl\{ P^{(2)\prime}(x) - \sum_{h=2}^{\infty} \lambda^{-(h-2)} \sum_{\nu=0}^{2} Q^{(h-\nu)}(x) P^{(\nu)}(x) \Bigr\} P_2^{-1}(x, \lambda). $$

Since the elements of $P^{(1)}(x)$ and $P^{(2)}(x)$ have at $x_0$ zeros of order at least $-(\eta - 2)/4$, it can be shown that $A$ is analytic in $x$ and is bounded in $x$ and $\lambda$. The proof of the theorem can now be continued as in the proofs of Theorem 2 and its corollary.

THEOREM 5. A solution for (2.1) of the form (5.2) with $k$ an arbitrary positive integer exists in $X$ for $|\lambda| > N$ in $\Lambda$ if, in addition to conditions (a), (b), and (c), the following conditions also hold:

(d) $Q^{(\nu)}(x) \equiv 0$ $(\nu \geq 1)$ and

(e) the matrix 


is bounded, where m is some positive integer 

Given a pair of suitable regions in the $x$ and $\lambda$ planes, there exists, under the specified conditions, a fundamental region containing $x_0$ in which, for $|\lambda| > N$ in the suitable $\lambda$ region, there is a solution $Y(x, \lambda)$ of (2.1) such that

$$ Y(x, \lambda) \sim \Bigl\{ \sum_{\nu=0}^{\infty} \lambda^{-\nu} P^{(\nu)}(x) \Bigr\} E(x, \lambda), $$

where the $P^{(\nu)}(x)$ are analytic.

Under the conditions given in the statement of the theorem, it is easily shown by induction that matrices $P^{(\nu)}(x)$ $(\nu = 1, 2, 3, \dots)$ may be determined to have respectively zeros at $x_0$ of order at least $-\nu(\eta - 1)/(2m + 1)$. Thus the matrix $A$ which has the expansion

$$ A \sim \{ P^{(k)\prime}(x) - Q^{(0)}(x) P^{(k)}(x) \} P_k^{-1}(x, \lambda) $$

(obtained from (5.8) by setting $Q^{(\nu)}(x) \equiv 0$ for $\nu \geq 1$) is analytic in $x$ and bounded in $x$ and $\lambda$ if $k \geq m$. Hence Theorem 5 follows as in the proofs of Theorem 2 and its corollary, for $k \geq m$, and thereby for any positive integer $k$.


7. Extension to infinite regions. The results obtained for finite $x$ regions in the preceding two sections under prescribed conditions on the coefficient functions of (2.1) are also valid for infinite regions under "equivalent", but not formally identical, conditions. This is seen as follows.

In (2.1), let $(x - x_0)$ be replaced by $-\zeta^{-1}$. The equation takes the form

$$ \frac{d}{d\zeta} Y(x_0 - \zeta^{-1}, \lambda) = \{ \lambda R(x_0 - \zeta^{-1}) \zeta^{-2} + Q(x_0 - \zeta^{-1}, \lambda) \zeta^{-2} \}\, Y(x_0 - \zeta^{-1}, \lambda) \tag{2.1'} $$

in which the new coefficient functions may be regarded as satisfying at $\infty$ conditions equivalent to those originally holding at the finite point $x_0$, in the sense that the solution $U(\zeta, \lambda) = Y(x_0 - \zeta^{-1}, \lambda)$ of (2.1') has the same properties at $\zeta = \infty$ as does $Y(x, \lambda)$ at $x_0$.

Since (2.1') is of the same form as (2.1), the effect of the above transformation
may be reproduced qualitatively by a restatement of conditions on the coefficient
functions of (2.1) so as to apply in the neighborhood of $\infty$. Thus, by a
suitable rewording, the previous theorems may be stated so as to hold for
infinite regions. For example, the theorem of Langer's paper¹⁷ may be given
the form:

THEOREM 6. A set of conditions sufficient that there exist infinite regions X
and $\Lambda$ such that for x in X and for $|\lambda| > N$ in $\Lambda$ there is a solution for (2.1) of the
form

$$Y(x, \lambda) \sim \Big\{\sum_{\nu=0}^{\infty} \lambda^{-\nu} P^{(\nu)}(x)\Big\} E(x, \lambda),$$

the $P^{(\nu)}(x)$ being analytic and bounded, is that for x in the neighborhood of $\infty$ and for
$\lambda$ in $\Lambda$:

(a) the matrix $x^{2}R(x)$ be analytic;

(b) the differences $r_i(x) - r_j(x)$ $(i, j = 1, \cdots, n;\ i \ne j)$ have zeros at $x = \infty$
of order exactly 2;

(c) the matrix $x^{2}Q(x, \lambda)$ be analytic in x; and

(d) for $|\lambda| > N$:

$$Q(x, \lambda) \sim \sum_{\nu=0}^{\infty} \lambda^{-\nu} Q^{(\nu)}(x),$$

where the matrices $x^{2}Q^{(\nu)}(x)$ $(\nu = 0, 1, \cdots)$ are analytic.


The conditions given in the above theorem are also sufficient that $x = \infty$ be
an interior point of the x regions in question.


UNIVERSITY OF MARYLAND. 


17 Op. cit., p. 162. 













A SELF-RECIPROCAL FUNCTION 
By R. S. Varma 
The object of this paper is to establish the following theorem. 
The function 
f(a) = a eT, (2) J (2) 
is¹ $R_\nu$, provided $R(\nu) > -1$.
We shall require the integral 
I= [ a’ te Wim(4x°) J ,(ax)J (ax) dx. 
0 
This can be evaluated by substituting for J,(ax)J,(ax) the equivalent infinite 
series (see [4], p. 380) 


F (—1)'T(p + ¢ + 27 + 1)(fax)"**™ | 
—orlpt+r+)r@qtrt+Drp+qtrtl) 


and integrating term by term by the help of the integral (see [2]) 
[ git eb W(x) dx 
0 


_Td+m+ rl — m+ 4) eo 4-1—k+1:;—a° 
0 TEED + + hl mt i - b+ 15-0) 


(l+m+4>0,|3(e)| < 1). 
We then obtain 
2" Tas + 4p + 4g +m + HPs + 4p + fg — mt 9) 


ee ee eS ee 
P(p + IL@ + ITGs + op + 9q — & + 1) 
2p + 39 + 2, ap + 39 + 1, 38 + 3p + 39 + m + 3, 
wm xan te pta+l, 


bp + 3g — m+}; oT 
as + apt iq—kt+1 
Term by term integration is justified by virtue of 
(2) | Wem(x)| = OC 2"), — | J(x) | = OC") 


and by virtue of the size of the terms in the series of J,(ax)J,(ax). Hence the 
result (1) has been shown to be true when R(p) > —1, R(q) > —1, and 


Received January 24, 1941. 
¹ Following Hardy and Littlewood, we say that a function is $R_\nu$ when it is self-reciprocal
in the Hankel-transform of order $\nu$.







260 R. S. VARMA 


$R(s + p + q + 2m + 1) > 0$. By the theory of analytical continuation, the
first two conditions $R(p) > -1$ and $R(q) > -1$ may be removed.
It may be noted that, since

$$W_{k,\pm\frac14}(\tfrac12 x^{2}) = 2^{-k} x^{\frac12} D_{2k-\frac12}(x),$$

the above integral reduces for $m = \pm\frac14$ to a result deduced by me (see [3])
elsewhere.
To establish the above theorem, let us assume that 


oe Wi.m(32°) J p(x) J o(2) 
is R, , i.e., that 
ao” Wam( 42°) J p(x) J q(x) 


(3) - 1 2 
= [ (xy)' J,(ay)y* et Wim oy )J p(y) J aly) dy. 
0 


If, therefore, we set 


f(s) = [ ae Wiem(42°) J p(x) J g(x) da, 
#0 
then 
yer re gob Wy (Fa? )I p(a)I gla) dx 


@ 


= a dx [ (xy) J (acy)y et Wim (by) I p(y) J o(y) dy 
0 Jo 


{ 


The change in the order of integration is permissible by de la Vallée Poussin's
theorem (see [1]) on account of the asymptotic estimates (2). Since (see [4],
p. 383)


ain te 2°T(3n + 3q + 3) ” 1 @ 
x’ J,(ax) dt = aTdn — iq +P [R(qg) < 4, Rn+q+1) >), 


yr} ty? Wim(ay J p(y) J a(y) dy [ a J,(xy) dx. 
0 


Se ere 
Py ae oe 2) A ye Wem dy) I oy) Fay) dy 


, (gv + $8 + 3) 
ry — $s + 2) 
Considering the equation (4) in conjunction with the result (1), we have, for 
the validity of (3), the equation 
PQA + 38+ bp + dg + m+ EGA + ds + bp + tg — mtd) 
x Pv — 3s + $)0GA + bp + dg — $s — +98) 


f(A + 8s) = 2 
(4) 
= 2 



















SELF-RECIPROCAL FUNCTION 261 


= TA + $s + dp + fg — + Ty + fs +: 2) 
X PGA — 4s + 4p t ig tm + IPGA — $s + dp +4qg—m+1) 


which is satisfied by taking 


$$k = m + \tfrac12, \qquad \lambda = \nu - 2m - p - q - \tfrac12.$$


Hence, 


v—2m—p—q—} —}2r2 yz 2 
gah tm (42) oS p(t) J q(x) 


is R,. Since 


$$W_{m+\frac12,\,m}(x) = x^{m+\frac12} e^{-\frac12 x},$$


we have, therefore, proved that 


gr Pte 7 (2) F,(2) 


is R, , i.e., that 


gat ot? 7 (x) J (2) =| (xy) J,(ay)y’? 4 e™ J(y) J ey) dy, 
0 


provided R(v) > —1. 


BIBLIOGRAPHY

1. T. J. I'A. Bromwich, Infinite Series (Second edition, revised), London, 1926, p. 504.

2. S. Goldstein, Operational representation of Whittaker's confluent hypergeometric function
and Weber's parabolic cylinder function, Proceedings of the London Mathematical
Society (2), vol. 34(1932), pp. 103-125.

3. R. S. Varma, Some infinite integrals involving parabolic cylinder functions, Journal de
Mathématiques, vol. 18(1939), pp. 157-166.

4. E. T. Whittaker and G. N. Watson, Modern Analysis, Fourth edition, Cambridge,
England, 1927.


The University, Lucknow, India.





THE STRUCTURE OF THE GROUP OF $\mathfrak{P}$-ADIC 1-UNITS


By David Gilbarg


Introduction. It is well known that in local $\mathfrak{P}$-adic fields the logarithm function
can be defined by the series

$$-\log(1 - x) = \sum_{\nu=1}^{\infty} \frac{x^{\nu}}{\nu},$$

where x is a $\mathfrak{P}$-adic number, the series converging for all x with $|x|_{\mathfrak{P}} < 1$.

Although the logarithm is needed in various connections for algebraic number
theory, very little is known about it. Even its value domain is still unknown,
except for this fact: $\log(1 - x)$ maps the set of x for which $|x|_{\mathfrak{P}} < |p|^{\frac{1}{p-1}}$ onto
itself in a one-to-one way. However, the mapping of those x for which
$|p|^{\frac{1}{p-1}} \le |x|_{\mathfrak{P}} < 1$ is still unknown, and it is this which must be investigated.
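The convergence statement and the additive behavior of the logarithm can be checked numerically in the simplest case. The following is a minimal sketch, assuming the rational 3-adic field in place of a general $\mathfrak{P}$-adic field; the helper names `vp` and `neg_log_partial`, the prime 3, and the sample values 3 and 6 are illustrative choices, not taken from the paper. It computes exact partial sums of the series and compares them 3-adically.

```python
from fractions import Fraction

def vp(q, p):
    """p-adic valuation of a nonzero rational q (hypothetical helper)."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("the valuation of 0 is infinite")
    num, den, v = q.numerator, q.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def neg_log_partial(x, terms):
    """Exact partial sum of -log(1 - x) = sum_{nu >= 1} x^nu / nu."""
    x = Fraction(x)
    return sum(x**nu / nu for nu in range(1, terms + 1))

p, terms = 3, 30
x, y = Fraction(3), Fraction(6)      # both satisfy |.|_3 < 1
z = 1 - (1 - x) * (1 - y)            # chosen so that (1 - z) = (1 - x)(1 - y)

# Valuations of the terms x^nu / nu: they grow without bound, so the series converges.
term_vals = [vp(x**nu / nu, p) for nu in range(1, terms + 1)]

# log of a product equals the sum of the logs, up to the truncation error.
defect = neg_log_partial(z, terms) - neg_log_partial(x, terms) - neg_log_partial(y, terms)
defect_val = vp(defect, p) if defect != 0 else float("inf")

print(term_vals[:9])
print(defect_val)
```

The large valuation of `defect` reflects that the only discrepancy comes from the discarded tails of the three series.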

Many of the explicit formulas for the reciprocity law in algebraic number
fields are best stated by means of the $\mathfrak{P}$-adic logarithm.¹ Although these
explicit formulas have been proved, they are not clearly understood; it is probable
that complete knowledge of the value domain of the $\mathfrak{P}$-adic logarithm
would better our understanding of the formulas. This knowledge would be
of use also for other applications to algebraic number theory.

Let K be a $\mathfrak{P}$-adic number field; then all units $\varepsilon$ which are congruent to 1
modulo $\mathfrak{P}$ (the 1-units of Hensel) constitute the set of elements in K having
logarithms. If the structure of this multiplicative group of 1-units were completely
known in some convenient way, then also the value domain of the
$\mathfrak{P}$-adic logarithm would be known. M. Krasner has attacked this problem,²
considering the general case where K is normal over a field k. His method was




the following. Let G be the Galois group of K/k, $\sigma, \tau, \cdots$ its elements, and
form the group ring $\Gamma$ of G taken over the ring of p-adic integers; $\Gamma$ consists of
elements $\gamma = a\sigma + b\tau + \cdots$, where $a, b, \cdots$ are p-adic integers. If $\varepsilon$ is a
1-unit, then the hypercomplex power $\varepsilon^{\gamma}$ can be defined in the usual way,

$$\varepsilon^{\gamma} = (\sigma\varepsilon)^{a}(\tau\varepsilon)^{b}\cdots = \sigma(\varepsilon^{a})\tau(\varepsilon^{b})\cdots$$

with $(\varepsilon^{\gamma})^{\gamma'} = \varepsilon^{\gamma'\gamma}$. In this way, $\Gamma$ is seen to be a ring of operators on the
group of 1-units. Krasner tried to find a minimal basis for the 1-units, taking
the hypercomplex exponents $\Gamma$ as operator domain; that is, he tried to find the
fewest number of 1-units $\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_r$ such that $\varepsilon_1^{\Gamma}\varepsilon_2^{\Gamma}\cdots\varepsilon_r^{\Gamma}$ give all 1-units
in K (except perhaps for roots of unity). It was hoped that an independent

Received July 28, 1941. The author wishes to express his indebtedness to Professor 
Artin and Dr. Whaples of the Indiana University mathematics department for their as- 
sistance in preparing this paper. 

¹ See [2] (numbers in brackets refer to the bibliography); see chapter IV on explicit
formulas for the reciprocity law and the $\mathfrak{P}$-adic logarithm.

² See M. Krasner [5]. All references to Krasner have to do with this paper.










STRUCTURE OF GROUP OF $\mathfrak{P}$-ADIC 1-UNITS 263


basis could be found, namely, 1-units $\varepsilon_1, \cdots, \varepsilon_r$ and possibly a $p^{l}$-th root of
unity $\eta$, such that $\eta^{a}\varepsilon_1^{\gamma_1}\varepsilon_2^{\gamma_2}\cdots\varepsilon_r^{\gamma_r} = 1$ implies: $a \equiv 0 \bmod p^{l}$, $\gamma_1 = \gamma_2 = \cdots = \gamma_r = 0$.
Such an independent set will be called a normal basis. Could a normal
basis be found, then the structure of the group of 1-units in K would be completely
known in terms of these basis elements and their behavior under the
automorphisms of K/k.
Krasner was able to prove the following theorems:
I. If K/k is of degree prime to p, then the 1-units of K have a normal basis.
II. Let K/k be regular, that is, without primitive p-th roots of unity. Then,
if K/k is without higher ramification,³ the 1-units of K have a normal basis
over k.
It was intimated that more general theorems than these would be proved at
a later date. However, I shall show here that Theorems I and II cannot be
included in more general simple theorems. Counterexamples will be given
in very simple fields to show that the assumption on the degree of K/k in Theorem I
cannot be dropped, and the converse to Theorem II will be proved.
Since Theorem I is of some interest, I offer a simplified proof of it. The
method here differs entirely from that of Krasner, and is relatively easy, although
at one point a theorem requiring the theory of algebras is used.


1. Groups of 1-units which have a normal basis. Although general normal
fields K/k do not have a normal basis for 1-units, it is easy to show that every
such field contains a subgroup of finite index (in the group of 1-units) which
does possess a normal basis.

To see this, let H be the group of 1-units in K/k, and consider the logarithms
of all elements in H. Since every 1-unit is determined, within a root of unity,⁴
by its logarithm, it follows that there is an isomorphism between the additive
group of logarithms and the multiplicative group of 1-units modulo the roots
of unity in H:

$$H/(\zeta) \cong \log H,$$

where $\zeta$ generates the roots of unity in H.

Let $G = \{\sigma, \tau, \cdots\}$ be the group of K/k. It is well known that K/k has a field
basis which is normal,⁵ $\sigma\alpha, \tau\alpha, \cdots$, consisting of the conjugates of a single
element, $\alpha$. If $\beta_1, \beta_2, \cdots, \beta_n$ is a basis for $k/R_p$ ($R_p$ being the rational p-adic
field), then clearly the elements $\sigma(\beta_i\alpha)$ are a basis for $K/R_p$. In particular, the
integers of K are expressible by linear combinations, with coefficients in $R_p$, of

³ Let $\mathfrak{p}$ and $\mathfrak{P}$ be the prime ideals in k and K respectively; then $\mathfrak{p} = \mathfrak{P}^{e}$, where e is the
ramification index of K/k. If e is divisible by p, then K/k is said to have higher ramification.

⁴ $\log x = 0$ implies x is a root of unity with order a power of p (these being the only
roots of unity that are at the same time 1-units).

⁵ The normal basis meant here is a field basis, and is not to be confused with a normal
basis for 1-units.





264 DAVID GILBARG 


the $\sigma(\beta_i\alpha)$. The denominators of these coefficients are bounded by the discriminant
of the $\sigma(\beta_i\alpha)$. Consequently, on dividing each $\beta_i\alpha$ by a high enough power
of p (say the discriminant), a new set of basis elements, $\sigma(A_i)$, is obtained, such
that every integer of K is expressible as a linear combination of the $\sigma(A_i)$, with
coefficients among the integers $\mathfrak{o}$ of $R_p$. Thus the group, $\Theta = \sum_{\sigma,i} \mathfrak{o}\,\sigma(A_i)$, contains
the integers of K as subgroup, and also, therefore, the group log H, which
consists only of integral elements.

Contained in log H are all those integers of K which lie in the convergence
domain of the $\mathfrak{P}$-adic exponential function⁶ (for, $\log e^{x} = x$, provided $e^{x}$ converges).
Now, if the basis elements $\sigma(A_i)$ are multiplied by a high enough
power of p, then new field basis elements, which we call $\sigma(B_i)$, are obtained,
and such that the entire group of elements in $\sum_{\sigma,i} \mathfrak{o}\,\sigma(B_i)$ lies in the convergence
domain of the $\mathfrak{P}$-adic exponential, and is therefore contained in log H. But,
by its construction, this group, $\sum_{\sigma,i} \mathfrak{o}\,\sigma(B_i)$, is of finite index in the group $\Theta$. This
shows two things which will be needed: (1) $\sum_{\sigma,i} \mathfrak{o}\,\sigma(B_i)$ is of finite index in log H,
(2) log H is of finite index in the group $\Theta = \sum_{\sigma,i} \mathfrak{o}\,\sigma(A_i)$ (hence also in the integers
of K).

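It follows from (1) and from the isomorphism $H/(\zeta) \cong \log H$, that the multiplicative
group, $\exp \sum_{\sigma,i} \mathfrak{o}\,\sigma(B_i)$, obtained by taking the exponential of every
element in $\sum_{\sigma,i} \mathfrak{o}\,\sigma(B_i)$ is of finite index in H. The 1-units $\varepsilon_i = e^{B_i}$ form a normal
basis for this subgroup. This is seen easily as follows. If $\Gamma$ is the group ring
generated by G over $\mathfrak{o}$, then

$$\varepsilon_1^{\Gamma}\varepsilon_2^{\Gamma}\cdots\varepsilon_n^{\Gamma} = (e^{B_1})^{\Gamma}(e^{B_2})^{\Gamma}\cdots(e^{B_n})^{\Gamma} = e^{\Gamma B_1 + \Gamma B_2 + \cdots + \Gamma B_n} = \exp \sum_{\sigma,i} \mathfrak{o}\,\sigma(B_i)$$

and this representation for the elements of the subgroup is unique, since

$$1 = \varepsilon_1^{\gamma_1}\varepsilon_2^{\gamma_2}\cdots\varepsilon_n^{\gamma_n} = e^{\gamma_1 B_1 + \gamma_2 B_2 + \cdots + \gamma_n B_n}$$

implies that $\gamma_1 = \gamma_2 = \cdots = \gamma_n = 0$ because $e^{x} = 1$ only for $x = 0$ and
$\sum \gamma_i B_i = 0$ only for all $\gamma_i = 0$.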


The above shows that every field contains a subgroup with normal basis of
finite index in the entire group of 1-units. Although this is not a strong statement
about the structure of the 1-units, it is sufficient for local class field theory


⁶ The $\mathfrak{P}$-adic exponential is defined by the power series for $e^{x}$, the arguments being
$\mathfrak{P}$-adic numbers. The convergence domain for the exponential is the set of numbers x
such that $|x|_{\mathfrak{P}} < |p|^{\frac{1}{p-1}}$. For x in the convergence domain, $e^{x} \equiv 1 \bmod \mathfrak{P}$, so $e^{x}$ is a 1-unit.
A number is uniquely determined by its exponential.













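where Herbrand's lemma is used for making index computation, and in the
process requires knowledge only of the subgroup.⁷

However, when the degree of K/k is prime to p, it is possible to make the
strong statement:

THEOREM I. If K/k is normal, of degree prime to p, there are 1-units $\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_n$
such that every 1-unit $\varepsilon$ is expressible in the form

$$\varepsilon = \zeta^{\nu}\varepsilon_1^{\gamma_1}\varepsilon_2^{\gamma_2}\cdots\varepsilon_n^{\gamma_n},$$

where $\zeta$ is a primitive $p^{l}$-th root of unity of maximum order, $\gamma_i \in \Gamma$, which is the group ring
of G over the p-adic integers $\mathfrak{o}$, and n is the degree of $k/R_p$. Also, the expression
is unique, so that

$$\zeta^{\nu}\varepsilon_1^{\gamma_1}\varepsilon_2^{\gamma_2}\cdots\varepsilon_n^{\gamma_n} = 1$$

implies $\nu \equiv 0 \bmod p^{l}$, $\gamma_1 = \gamma_2 = \cdots = \gamma_n = 0$.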


Proof. Again we concentrate our attention on the logarithms of 1-units
rather than on the 1-units themselves. As before, we have the isomorphism

$$H/(\zeta) \cong \log H.$$

It will now be possible to find an additive basis over $\Gamma$ for the group of logarithms.

Just as before, form the group $\sum_{\sigma,i} \mathfrak{o}\,\sigma(A_i) = \Theta$; this can also be written:

$$\Theta = \Gamma A_1 + \Gamma A_2 + \cdots + \Gamma A_n.$$

The expression for elements of $\Theta$ is unique, since the $\sigma(A_i)$ form an independent
basis over $R_p$.

By its construction $\Theta$ is a left $\Gamma$-modulus and from (2) on page 264 contains
log H as subgroup of finite index; log H is also a $\Gamma$-modulus, since $\Gamma \log H = \log H$.
Now, in the special case when the order of G is prime to p, the group
ring $\Gamma$ is a principal ideal ring (the proof of this fact will be outlined in the
next section). It follows that log H has an independent basis⁸ over $\Gamma$ of n
elements, say $\log \varepsilon_1, \log \varepsilon_2, \cdots, \log \varepsilon_n$. Thus,

⁷ The statement of Herbrand's lemma is:

Let G be a group, $T_1$ and $T_2$ homomorphisms of G into itself, such that $T_1T_2G = T_2T_1G = 1$;
let $G_1$ be the subgroup for which $T_1G_1 = 1$, $G_2$ the subgroup for which $T_2G_2 = 1$. If,
now, g is a subgroup of G, (G:g) finite, such that $T_1$ and $T_2$ take g also into itself, and $g_1$, $g_2$
are defined in g just as $G_1$, $G_2$ were in G, then

$$\frac{(G_2 : T_1G)}{(G_1 : T_2G)} = \frac{(g_2 : T_1g)}{(g_1 : T_2g)}.$$

For local class field theory, G would be the group of 1-units, g the subgroup having a normal
basis.

⁸ Here the well-known theorem is used that any submodulus of a modulus with a finite
basis over a principal ideal ring also has a basis. Generally, it might occur that the basis
elements of the submodulus have annihilating ideals other than (0), since the ring ($\Gamma$ in this
case) contains divisors of zero. However, in our case this cannot occur, as the final argument
shows.







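$$\log H = \Gamma \log \varepsilon_1 + \Gamma \log \varepsilon_2 + \cdots + \Gamma \log \varepsilon_n = \log \varepsilon_1^{\Gamma}\varepsilon_2^{\Gamma}\cdots\varepsilon_n^{\Gamma},$$

and hence

$$H = (\zeta)\,\varepsilon_1^{\Gamma}\varepsilon_2^{\Gamma}\cdots\varepsilon_n^{\Gamma};$$

every element $\varepsilon$ of H is thus in the form

$$\varepsilon = \zeta^{\nu}\varepsilon_1^{\gamma_1}\varepsilon_2^{\gamma_2}\cdots\varepsilon_n^{\gamma_n}.$$

No non-trivial relations can exist of the sort

$$\varepsilon_1^{\gamma_1}\varepsilon_2^{\gamma_2}\cdots\varepsilon_n^{\gamma_n} = 1,$$

for this would say that the N elements $\varepsilon_i^{\sigma}$ which generate H over $\mathfrak{o}$ are not independent,
whereas the existence of a normal basis of n elements (over $\Gamma$)
for a subgroup of H implies that there are at least N independent 1-units over $\mathfrak{o}$
in any set of generators for H. This proves the theorem.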

In the special case that $k = R_p$ and $K/R_p$ is a normal extension of degree
prime to p, this gives the theorem that every 1-unit is uniquely expressible in
the form $\xi = \zeta^{\nu}\varepsilon^{\gamma}$, where $\gamma = a\sigma + b\tau + \cdots$, with $a, b, \cdots \in \mathfrak{o}$. In terms of the
$\mathfrak{P}$-adic logarithm, this states that every logarithm can be uniquely written as

$$\log \xi = a \log \sigma\varepsilon + b \log \tau\varepsilon + \cdots = a\sigma \log \varepsilon + b\tau \log \varepsilon + \cdots.$$

In other words, the value domain of the $\mathfrak{P}$-adic logarithm in absolutely normal
fields is an $\mathfrak{o}$-lattice, generated over $\mathfrak{o}$ by the conjugates of the logarithm of
a single 1-unit (or by the logarithm of the conjugates).

In relative normal fields K/k of degree prime to p, the conclusion is almost
the same, except that, instead of one basis element, n are needed, where n is the
degree of $k/R_p$.

We shall see by simple counterexamples that such a representation is possible
only part of the time, indicating that the value domain of the logarithm in
general local fields is much more complicated than in the instance treated above.


2. The group ring $\Gamma$. In the preceding section, the proof of Theorem I
depended on the fact that $\Gamma$ is a principal ideal ring. I outline the proof of the
following more inclusive theorem:


Let k be any p-adic field, $\mathfrak{o}$ its ring of integers, and G any group of order prime
to p. Then the group ring $\mathfrak{o}(G)$, generated by G over $\mathfrak{o}$, is a principal ideal ring.


Proof. The proof requires the theory of algebras. The group ring k(G),
where G is taken over k, is a semi-simple algebra. By the Wedderburn structure
theorems for semi-simple algebras [6], k(G) is the direct sum of simple two-sided
ideals, $k(G) = \mathfrak{A}_1 + \mathfrak{A}_2 + \cdots + \mathfrak{A}_r$; each of these simple algebras
can be represented as a total matric algebra in a division algebra D over k.
Since k is a p-adic field, its valuation can be continued in a unique way to any
finite extension field, whether commutative or not; hence the division algebra D



















can be valuated. Thus D has a unique maximal integral domain $\mathfrak{o}_D$, which is
a principal ideal ring; if $\mathfrak{P}$ is the prime ideal in D, then $\mathfrak{o}_D$ consists of those
elements x such that $|x|_{\mathfrak{P}} \le 1$.

One proves easily that a total matric ring in $\mathfrak{o}_D$ is a maximal domain of the
total matric algebra in D. Let $\mathfrak{O}$ be such a maximal domain. Since $\mathfrak{o}_D$ is a
principal ideal ring, it follows, without much difficulty, that $\mathfrak{O}$ is also a principal
ideal ring [3].

There are infinitely many representations for the simple algebra $\mathfrak{A}$ as matric
algebra in D, these representations differing by an inner automorphism. The
following argument shows that every maximal domain of $\mathfrak{A}$ is of the same type
as $\mathfrak{O}$, that is, is the ring of matrices in $\mathfrak{o}_D$ for some representation of $\mathfrak{A}$ as a total
matric algebra in D. Let $\mathfrak{O}'$ be an arbitrary maximal domain of $\mathfrak{A}$. Then
$\mathfrak{O}\mathfrak{O}'$ is a left $\mathfrak{O}$-ideal, hence $\mathfrak{O}\mathfrak{O}' = \mathfrak{O}a$, since $\mathfrak{O}$ is a principal ideal ring. Then
$\mathfrak{O}\mathfrak{O}'$ is a right $\mathfrak{O}'$-ideal, so $\mathfrak{O}a\mathfrak{O}' \subset \mathfrak{O}a$, and consequently $a\mathfrak{O}' \subset \mathfrak{O}a$, therefore
$\mathfrak{O}' \subset a^{-1}\mathfrak{O}a$. Since $\mathfrak{O}$ is a domain, so is $a^{-1}\mathfrak{O}a$, and since $\mathfrak{O}'$ is maximal,
it follows that $\mathfrak{O}' = a^{-1}\mathfrak{O}a$. Therefore, $\mathfrak{O}'$ is a maximal domain of the same
type as $\mathfrak{O}$ in the representation for $\mathfrak{A}$ obtained by transforming with a.

Now, in the group ring k(G), we have that $\mathfrak{o}(G)$ is a maximal domain. This
is seen as follows. The order of G is prime to p, hence is a unit in $\mathfrak{o}$; the discriminant
of $\mathfrak{o}(G)$, which is a power of the group order, is then also a unit in $\mathfrak{o}$.
Consequently, $\mathfrak{o}(G)$ is contained in no larger domain, for otherwise the discriminant
of $\mathfrak{o}(G)$ would have to contain a square factor other than a unit.
From this it follows that the components of $\mathfrak{o}(G)$ in the simple algebras of k(G)
are maximal domains.

The above arguments show directly that $\mathfrak{o}(G)$ is the direct sum of principal
ideal rings, from which it is obvious that $\mathfrak{o}(G)$ itself is a principal ideal ring.
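The proof above turns on the fact that the order of G is a unit in the ring of integers. One elementary consequence of that fact can be seen in a small computation: the average of the group elements is then an idempotent of the group ring, which is the first step toward splitting the ring into components. The sketch below is only an illustration of this side remark, not of the argument in the text; it assumes a cyclic group of order 3 and works with 5-adic integers truncated modulo $5^{4}$, and the function name `convolve` is hypothetical.

```python
from math import gcd

def convolve(u, w, n, modulus):
    """Product in (Z/modulus)[C_n]: cyclic convolution of coefficient lists."""
    out = [0] * n
    for i, a in enumerate(u):
        for j, b in enumerate(w):
            out[(i + j) % n] = (out[(i + j) % n] + a * b) % modulus
    return out

# Hypothetical sample data: p = 5, truncation 5^4, cyclic group of order n = 3.
p, precision, n = 5, 4, 3
modulus = p**precision

order_is_unit = gcd(n, p) == 1        # the hypothesis "order prime to p"
inv_n = pow(n, -1, modulus)           # exists exactly because n is prime to p

e = [inv_n] * n                       # e = (1/n) * (sum of all group elements)
e_squared = convolve(e, e, n, modulus)

print(order_is_unit, e_squared == e)
```

When the group order is divisible by p the inverse `inv_n` does not exist, and this construction is unavailable.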




3. Fields without a normal basis for 1-units. Krasner has proved the following
theorem.

In a regular field K/k (that is, not containing a primitive p-th root of unity),
the 1-units have a normal basis of n elements if K/k has no higher ramification
(n is the degree of $k/R_p$).

I show the converse here.

In any regular field K/k with higher ramification, there exists no normal basis
for the 1-units.

Lemma 1. If the 1-units of K/k have a normal basis, then all 1-units of k are
norms.

Proof. Let $\varepsilon = \varepsilon_1^{\gamma_1}\varepsilon_2^{\gamma_2}\cdots\varepsilon_n^{\gamma_n}$ be a 1-unit of k, the $\varepsilon_i$ being the normal basis
for the 1-units of K. Then $\varepsilon$ is left invariant by the automorphisms of K/k.
But since the representation for $\varepsilon$ is unique, $\sigma\gamma_i = \tau\gamma_i = \cdots = \gamma_i$ ($\sigma, \tau, \cdots$ being
all the automorphisms of K/k). This means that $\gamma_i = a_i(\sigma + \tau + \cdots)$, or

$$\varepsilon = (N\varepsilon_1)^{a_1}(N\varepsilon_2)^{a_2}\cdots(N\varepsilon_n)^{a_n} = N(\varepsilon_1^{a_1}\varepsilon_2^{a_2}\cdots\varepsilon_n^{a_n}),$$

where N stands for "norm".







Lemma 2. If K/k is purely ramified of degree p, then the index $(\varepsilon : NE) = p$,
where $\varepsilon$ equals all 1-units of k, E equals all 1-units of K.

Proof. For this we use the theorem⁹ that in local cyclic fields, $(u : NU) = e$,
where u equals all units of k, U equals all units of K, and e is the ramification
index of K/k; in this case, since K/k is purely ramified, e = p. Let the number
of residue classes mod $\mathfrak{P}$ be $p^{f}$; then k contains the $p^{f} - 1$ roots of unity, which
form a complete system of representatives for the multiplicative residue classes
mod $\mathfrak{P}$. Hence, if $\eta$ runs through the $p^{f} - 1$ roots of unity, then $u = \eta\varepsilon$,
$U = \eta E$. So,

$$p = (u : NU) = (\eta\varepsilon : \eta NE) = (\varepsilon : NE),$$

proving the lemma.

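To prove the theorem, now let K/k have higher ramification. Suppose
K/k did have a normal basis for its 1-units. Since the ramification order is
divisible by p, there is a field $\overline{K}$ under K such that K is purely ramified of degree
p over $\overline{K}$.¹⁰ We contend that the 1-units of K have a normal basis over $\overline{K}$.
Let $\varepsilon_1, \cdots, \varepsilon_n$ be the assumed basis for the 1-units of K with group ring $\Gamma$ as
exponent domain; $\Gamma$ is the set of elements $\sum_{\sigma} a_{\sigma}\sigma$. If $\overline{G}$ is the group of $K/\overline{K}$,
let $G = \overline{G}\lambda + \overline{G}\mu + \cdots$ be a division of G into right cosets with respect to
$\overline{G}$, $\lambda, \mu, \cdots$ being a complete set of representatives; then if $\overline{\Gamma}$ is the group ring
of $\overline{G}$ taken over $\mathfrak{o}$, it follows that

$$\Gamma = \overline{\Gamma}\lambda + \overline{\Gamma}\mu + \cdots.$$

The units of K are given by:

$$E = \varepsilon_1^{\Gamma}\varepsilon_2^{\Gamma}\cdots\varepsilon_n^{\Gamma} = \varepsilon_1^{\overline{\Gamma}\lambda}\varepsilon_1^{\overline{\Gamma}\mu}\cdots\varepsilon_n^{\overline{\Gamma}\lambda}\varepsilon_n^{\overline{\Gamma}\mu}\cdots = (\varepsilon_1^{\lambda})^{\overline{\Gamma}}(\varepsilon_1^{\mu})^{\overline{\Gamma}}\cdots(\varepsilon_n^{\lambda})^{\overline{\Gamma}}(\varepsilon_n^{\mu})^{\overline{\Gamma}}\cdots.$$

Furthermore, the $\varepsilon_i^{\lambda}, \varepsilon_i^{\mu}, \cdots$ are independent over $\overline{\Gamma}$, as one sees readily by
reversing the above steps. Hence, the elements $\varepsilon_i^{\lambda}, \varepsilon_i^{\mu}, \cdots$ are a normal basis
for the 1-units of $K/\overline{K}$. From Lemma 1 (taking $\overline{K}$ instead of k), it follows
that all 1-units of $\overline{K}$ are norms, in contradiction to Lemma 2, for, according to
the latter, a purely ramified extension of degree p (as in $K/\overline{K}$) sends its norms
of 1-units into a proper subgroup (of index p) in the group of 1-units of $\overline{K}$.
This proves the theorem.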


4. Counterexamples. From the preceding sections we have seen that any 


⁹ For a proof of this important theorem of local class field theory, see [1]. It is this
theorem for which the subgroup of section 1 can be used (see footnote 7).

¹⁰ Proof of this fact requires the theory of the inertial group (the group T leaving fixed
the largest subfield unramified over k). The order of T is e; since p divides e, then by
Sylow's theorem, there is a subgroup of order p, which therefore leaves fixed a subfield $\overline{K}$
such that $K/\overline{K}$ is purely ramified of degree p.
















regular field contains a normal basis for 1-units if and only if the field is without
higher ramification. If the field is irregular, then a normal basis exists if its
relative degree is not divisible by p. That a simple more general theorem than
the latter is not to be expected is seen from the examples of the following two
fields:

(1) $R_2(2^{1/2})$. This field is of degree 2, and ramified over $R_2$. The field is irregular,
containing the second roots of unity. The 1-units of this field do not have a
normal basis; however, this is not surprising since the field has higher ramification.

(2) $R_2((-3)^{1/2})$. This field is of degree 2, unramified over $R_2$, and also has no
normal basis. This shows that ramification is not the sole conditioning factor
for the existence of a normal basis for 1-units.

It will be necessary first to express the 1-units of these fields in a simple way.

(1) Consider $R_2(2^{1/2})/R_2$; its prime ideal is $(2^{1/2})$. Every 1-unit of the field, as
will be shown, can be expressed as a power product, with 2-adic integers as
exponents, of the 1-units $1 + 2^{1/2}$, 5 (and possibly a factor −1).

Any 1-unit can be written in the form

$$1 + a_1 2^{1/2} + a_2 (2^{1/2})^{2} + \cdots,$$

where the $a_i$ are units of the field or 0; a 1-unit in which $a_k$ is the first non-zero
coefficient is said to be of order k.

In this field, since there are only two residue classes of integers mod $(2^{1/2})$,
any 1-unit of order k is a representative for the entire set of 1-units of order k.
Let the set $\eta_k$ $(k = 1, 2, \cdots)$ be such representatives for the 1-units of every
order. Then every 1-unit is uniquely expressible as a product of the $\eta_k$. For,
let $\varepsilon$ be any 1-unit, say of order r; then $\varepsilon \equiv \eta_r \bmod (2^{1/2})^{r+1}$; hence $\varepsilon = \eta_r\varepsilon_s$,
where $\varepsilon_s$ is a 1-unit of order s, $s \ge r + 1$; $\varepsilon_s \equiv \eta_s \bmod (2^{1/2})^{s+1}$, so $\varepsilon = \eta_r\eta_s\varepsilon_t$,
$t \ge s + 1$; proceeding in this way, we obtain $\varepsilon = \eta_r\eta_s\eta_t\cdots$, which is unique.
Now we see that $1 + 2^{1/2}$ and 5 generate 1-units of every order: $1 + 2^{1/2}$ is of order
1; $(1 + 2^{1/2})^{2}$ is of order 2; $-(1 + 2^{1/2})^{2} = 1 - (2^{1/2})^{3} - (2^{1/2})^{4} = a$ is of order 3;
the order of $a^{2^{r}}$ is easily seen to be 2r + 3 for all r; hence the powers of $1 + 2^{1/2}$
generate 1-units of every odd order. $5 = 1 + (2^{1/2})^{4}$ is of order 4; the order of
$5^{2^{r}}$ is 2r + 4; consequently, 5 generates 1-units of all even orders $\ge 4$. Hence,
between $1 + 2^{1/2}$ and 5 (with possibly a factor of −1), 1-units of every order
are generated. Thus, every 1-unit can be written:¹¹

$$\varepsilon = \pm(1 + 2^{1/2})^{a}5^{b},$$

where a and b are 2-adic integers. Such an expression is unique, for any relation
of the sort $\pm(1 + 2^{1/2})^{a}5^{b} = 1$ would imply that $1 + 2^{1/2}$ and 5 can generate
1-units of the same order, which is clearly impossible from the above.
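The orders claimed for the powers of $1 + 2^{1/2}$ and of 5 can be confirmed by direct computation. The following is a minimal sketch, assuming that elements $x + y\,2^{1/2}$ with ordinary integers x, y are stored as pairs `(x, y)`; since $2^{1/2}$ is a prime element with $(2^{1/2})^{2} = 2$, the valuation of $x + y\,2^{1/2}$ is the smaller of twice the 2-adic valuation of x and one more than twice that of y. The helper names are illustrative only.

```python
def v2(n):
    """2-adic valuation of an integer (infinite for 0)."""
    if n == 0:
        return float("inf")
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return v

def mul(u, w):
    """Multiply x + y*sqrt(2) elements stored as integer pairs (x, y)."""
    (a, b), (c, d) = u, w
    return (a * c + 2 * b * d, a * d + b * c)

def order(u):
    """Order of a 1-unit u: valuation of u - 1 at the prime sqrt(2)."""
    x, y = u[0] - 1, u[1]
    return min(2 * v2(x), 2 * v2(y) + 1)

def power(u, n):
    """u raised to the non-negative integer power n."""
    out = (1, 0)
    for _ in range(n):
        out = mul(out, u)
    return out

eps = (1, 1)                      # 1 + sqrt(2)
eps_sq = mul(eps, eps)            # (1 + sqrt(2))^2 = 3 + 2*sqrt(2)
a = (-eps_sq[0], -eps_sq[1])      # a = -(1 + sqrt(2))^2
five = (5, 0)

odd_orders = [order(power(a, 2**r)) for r in range(5)]      # expected 2r + 3
even_orders = [order(power(five, 2**r)) for r in range(5)]  # expected 2r + 4

print(order(eps), order(eps_sq), odd_orders, even_orders)
```

Together with orders 1 and 2 from $1 + 2^{1/2}$ and its square, the two lists cover every order up to 12, in line with the pattern stated in the text.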
Now assume that the 1-units of $R_2(2^{1/2})/R_2$ could be generated by a normal basis,
and let the generating 1-unit be $\pm(1 + 2^{1/2})^{a}5^{b}$; a and b are 2-adic integers. Then,

¹¹ This is a special instance of a general theorem by Hensel which expresses the 1-units
of a $\mathfrak{P}$-adic field by means of a finite basis over the p-adic integers; see [4].








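since every 1-unit is assumed to be uniquely expressible in the form
$\pm[(1 + 2^{1/2})^{a}5^{b}]^{\alpha+\beta\sigma}$, where $\sigma$ is the automorphism of $R_2(2^{1/2})/R_2$, $\sigma^{2} = 1$, we also have

$$\pm[(1 + 2^{1/2})^{a}5^{b}]^{\alpha+\beta\sigma} = 5$$

for some 2-adic integers $\alpha$, $\beta$. Since

$$5 = 5^{\sigma} = \pm[(1 + 2^{1/2})^{a}5^{b}]^{\alpha\sigma+\beta}$$

and the representation must be unique, then $\alpha\sigma + \beta = \alpha + \beta\sigma$ or $\alpha = \beta$. Hence,

$$5 = \pm[(1 + 2^{1/2})^{a}5^{b}]^{\alpha(1+\sigma)}$$

or

$$5 = \pm(-1)^{a\alpha}5^{2b\alpha}, \qquad \therefore\ \pm 5^{2b\alpha-1} = 1.$$

It follows that $2b\alpha - 1 = 0$, which is impossible with b and $\alpha$ 2-adic integers.
This disproves the possibility of a normal basis for the 1-units of $R_2(2^{1/2})/R_2$.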
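The two facts used in the last step can be checked mechanically: applying $1 + \sigma$ to $1 + 2^{1/2}$ gives −1 and to 5 gives $5^{2}$, and $2b\alpha - 1$ is odd, so it cannot vanish. The sketch below assumes the same pair representation `(x, y)` for $x + y\,2^{1/2}$ and tests the congruence $2b\alpha \equiv 1$ modulo $2^{4}$ by brute force; the modulus is an arbitrary illustrative choice, since a 2-adic solution would give a solution modulo every power of 2.

```python
def mul(u, w):
    """Multiply x + y*sqrt(2) elements stored as integer pairs (x, y)."""
    (a, b), (c, d) = u, w
    return (a * c + 2 * b * d, a * d + b * c)

def conj(u):
    """The automorphism sigma: sqrt(2) -> -sqrt(2)."""
    return (u[0], -u[1])

eps = (1, 1)                                   # 1 + sqrt(2)
five = (5, 0)

norm_eps = mul(eps, conj(eps))                 # eps^(1 + sigma)
norm_five = mul(five, conj(five))              # 5^(1 + sigma)

# Search for residues b, alpha with 2*b*alpha - 1 = 0 modulo 2^4.
modulus = 2**4
solutions = [(b, al)
             for b in range(modulus)
             for al in range(modulus)
             if (2 * b * al - 1) % modulus == 0]

print(norm_eps, norm_five, solutions)
```

The empty list of solutions is the parity obstruction that rules out the assumed normal basis.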
(2) In $R_2((-3)^{1/2})/R_2$, the prime ideal is (2) since the field is unramified. There
are four classes of incongruent integers mod (2), which are represented by 0
and the cube roots of unity, $1, \omega, \omega^{2}$. Now any three incongruent 1-units of
order k can be taken as representatives for all 1-units of order k, and, as in
the preceding example, every 1-unit is uniquely expressible as a product of
these representatives. We shall see first that $1 + 2\omega = (-3)^{1/2}$ and $1 + 4\omega$
generate a complete set of representatives for 1-units of every order and hence
generate all 1-units. $1 + 2\omega$ and $-(1 + 2\omega)^{2} = 1 + 2$ are of order 1;
$(1 + 2)(1 + 2\omega) = 1 + (1 + \omega)2 + 4\omega$ is also of order 1; these three are all incongruent
mod $(2)^{2}$. $(1 + 2\omega)^{2} = 1 - 4$ and $1 + 4\omega$ are of order 2; $(1 - 4)(1 + 4\omega)
= 1 + (\omega - 1)4 - 16\omega$ is a third 1-unit of order 2, incongruent to the others,
hence those three are representatives for the 1-units of order 2. One sees readily,
as in example (1), that $(1 + 4\omega)^{2^{r}}$ and $(1 - 4)^{2^{r}}$ are 1-units of order r + 2 for
every r; also $(1 + 4\omega)^{2^{r}}(1 - 4)^{2^{r}}$ is of order r + 2 and these three 1-units being
incongruent mod $(2)^{r+3}$ represent all 1-units of order r + 2. Thus $1 + 2\omega = (-3)^{1/2}$
and $1 + 4\omega$ generate representatives for 1-units of every order; consequently every
1-unit of $R_2((-3)^{1/2})$ can be expressed in the form¹² $\varepsilon = \pm((-3)^{1/2})^{a}(1 + 4\omega)^{b}$;
a, b are 2-adic integers. One sees readily that this representation is unique.
Suppose the 1-units of $R_2((-3)^{1/2})/R_2$ have a normal basis, the generating
element being $\pm((-3)^{1/2})^{a}(1 + 4\omega)^{b}$. Then, for some 2-adic integers $\alpha$, $\beta$,

$$\pm[((-3)^{1/2})^{a}(1 + 4\omega)^{b}]^{\alpha+\beta\sigma} = (-3)^{1/2}.$$

Now,

$$((-3)^{1/2})^{\sigma} = -(-3)^{1/2}, \qquad (1 + 4\omega)^{\sigma} = 1 + 4\omega^{2} = 13(1 + 4\omega)^{-1},$$

so

$$\pm((-3)^{1/2})^{a(\alpha+\beta)-1}(1 + 4\omega)^{b(\alpha-\beta)}\,13^{b\beta} = 1.$$


¹² See footnote 11.













In $R_2$, $13 = 1 + 2^{2} + 2^{3}$ is a 1-unit, and $13^{b\beta}$ can be expressed as $13^{b\beta} = (-3)^{c}$, where c is a 2-adic
integer, since the 1-units of $R_2$ can be generated by −3. Hence $13^{b\beta} = ((-3)^{1/2})^{2c}$,
and thus

$$\pm((-3)^{1/2})^{2c + a(\alpha+\beta) - 1}(1 + 4\omega)^{b(\alpha-\beta)} = 1.$$

From the independence of the elements, $1 + 4\omega$ and $(-3)^{1/2}$, it follows that
$b(\alpha - \beta) = 0$; of the two possibilities, b = 0 or $\alpha = \beta$, the latter is clearly the
only possible one since b = 0 implies that $(-3)^{1/2}$ generates all 1-units. However,
$\alpha = \beta$ has as consequence that $0 = 2c + a(\alpha + \beta) - 1 = 2c + 2a\alpha - 1$, which
is impossible for c, a, and $\alpha$ 2-adic integers. Consequently, $R_2((-3)^{1/2})/R_2$ cannot
have a normal basis for its 1-units despite the fact it is unramified.
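The identities on which the second counterexample rests can be verified with integer arithmetic in the ring generated by a primitive cube root of unity. This is a minimal sketch, assuming elements $a + b\omega$ are stored as pairs `(a, b)` with $\omega^{2} = -1 - \omega$ and that the automorphism sends $\omega$ to $\omega^{2}$; the helper names and the modulus 8 used in the parity search are illustrative choices.

```python
def mul(u, w):
    """Multiply a + b*omega elements stored as pairs, using omega^2 = -1 - omega."""
    (a, b), (c, d) = u, w
    # (a + b w)(c + d w) = ac + (ad + bc) w + bd w^2, and w^2 = -1 - w
    return (a * c - b * d, a * d + b * c - b * d)

def conj(u):
    """The automorphism sigma: omega -> omega^2 = -1 - omega."""
    a, b = u
    return (a - b, -b)

root = (1, 2)                       # 1 + 2*omega, a square root of -3
gen = (1, 4)                        # 1 + 4*omega

root_sq = mul(root, root)           # should equal -3
root_conj = conj(root)              # should equal -(1 + 2*omega)
gen_norm = mul(gen, conj(gen))      # (1 + 4*omega)(1 + 4*omega^2), should equal 13

third_order1 = mul((3, 0), root)    # (1 + 2)(1 + 2*omega) = 3 + 6*omega
third_order2 = mul((-3, 0), gen)    # (1 - 4)(1 + 4*omega) = -3 - 12*omega

# 2c + 2*a*alpha - 1 is odd, so it has no zero modulo 8 (hence none 2-adically).
modulus = 8
has_solution = any((2 * c + 2 * a * al - 1) % modulus == 0
                   for c in range(modulus)
                   for a in range(modulus)
                   for al in range(modulus))

print(root_sq, root_conj, gen_norm, third_order1, third_order2, has_solution)
```

The value 13 for the norm of $1 + 4\omega$ is what produces the factor $13^{b\beta}$ in the relation, and the failed search is the parity contradiction that ends the argument.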


BIBLIOGRAPHY


1. C. Chevalley, La théorie du corps de classes, Annals of Mathematics, 2nd series, vol.
41(1940), pp. 394-418.

2. H. Hasse, Bericht über neuere Untersuchungen und Probleme aus der Theorie der algebraischen
Zahlkörper, part II (Reziprozitätsgesetz), Jahresbericht der deutschen
Mathematiker-Vereinigung, vol. 36(1927), pp. 233-311.

3. H. Hasse, Über $\mathfrak{P}$-adische Schiefkörper und ihre Bedeutung für die Arithmetik hyperkomplexer
Zahlsysteme, Mathematische Annalen, vol. 104(1930-1931), pp. 495-534.

4. K. Hensel, Die multiplikative Darstellung der algebraischen Zahlen für den Bereich eines
beliebigen Primteilers, Journal für die reine und angewandte Mathematik, vol.
146(1916), pp. 189-215.

5. M. Krasner, Sur la représentation exponentielle dans les corps relativement galoisiens de
nombres $\mathfrak{P}$-adiques, Acta Arithmetica, vol. 3(1939), pp. 133-173.

6. B. L. van der Waerden, Moderne Algebra, vol. II, Berlin, 1930-1931.

INDIANA UNIVERSITY.










AN EXPLICIT FORMULA FOR THE SOLUTION OF THE ULTRAHYPERBOLIC
EQUATION IN FOUR INDEPENDENT VARIABLES


By GLYNN OWENS 


1. Introduction. Equations of the form

$$(1.1)\qquad \sum_{i=1}^{n} \frac{\partial^{2} u}{\partial x_i^{2}} = \sum_{i=1}^{m} \frac{\partial^{2} u}{\partial y_i^{2}} \qquad (n \ge 2,\ m \ge 2)$$

are referred to as "ultrahyperbolic partial differential equations". In 1901,
G. Hamel [2],¹ investigating the problem of finding all geometries in which the
straight lines are the shortest ones, considered an equation equivalent to (1.1)
(n = m = 2); he established the existence of a function which satisfied the
equation in the neighborhood of the intersection of two characteristic hyperplanes
and which assumed prescribed initial values on these planes, but the
initial values are restricted to be analytic in two of their three arguments.
It has only been in recent years that properties of the solutions of (1.1) have
been discovered that are not restricted by analyticity.

In 1932, L. Asgeirsson [1] discovered a mean-value theorem which applies
to any twice continuously differentiable solution of (1.1) (n = m). By the
use of this mean-value theorem, it has been shown that the values of a solution
of (1.1) on a non-characteristic hyperplane cannot be arbitrarily assigned.²

In 1938, F. John [3], using Asgeirsson’s theorem, was able to determine the
most general solution, u, of (1.1) (n = m = 2) existing in all space; interpreting
the independent variables as suitable functions of the Plücker coordinates of
a line in 3-dimensional space, u becomes a function of lines and Asgeirsson’s
theorem takes the following form. If the line function u is a solution of (1.1)
(n = m = 2), then for every hyperboloid H of revolution and of one sheet, the
mean values of u for the two families of generating lines of H are equal. Calling
a function of the lines of 3-dimensional space with the above property harmonic,
John demonstrates that every harmonic line function which is twice continuously
differentiable is equivalent to a solution of (1.1) (n = m = 2). It is also
shown that the line integrals of a sufficiently regular point function form a
harmonic line function and that a sufficiently regular harmonic line function
is representable as line integrals of a point function.

The present paper considers the equation (1.1) (n = m = 2). The variables 
of this equation are regarded as the coordinates of a 4-dimensional point that 
varies in a domain defined by an initial hypersurface and the characteristic 


Received August 26, 1941; presented to the American Mathematical Society, November 
22, 1941. 

¹ The numbers in brackets refer to the bibliography.

² Asgeirsson’s results and the applications of his mean-value theorem are to be found in
the book Methoden der Mathematischen Physik, vol. II, by R. Courant and D. Hilbert.










FORMULA FOR SOLUTION OF THE ULTRAHYPERBOLIC EQUATION 273 


cone. Adapting H. Lewy’s [4] generalization of the “Riemann integration 
method” and applying it to the above domain, assuming the existence of a 
sufficiently regular solution in this domain, an integral formula is derived that 
gives the value of the solution at the vertex of the cone. The value at the vertex 
depends upon the values that the solution and a certain finite number of its 
derivatives assume on that part of the initial surface that is cut out by the cone. 
§2 contains the derivation of this formula which involves certain “Riemann 
functions” that are determined in §3. 

Under the above assumptions with regard to the solution of (1.1) (n = m = 2), 
it is shown in §4 that there must necessarily exist two equalities holding for the 
solution and its normal derivative on the initial hypersurface. That is, at a 
point on the initial hypersurface the value of the solution and of its normal 
derivative is dependent upon their values in a certain neighborhood on the initial 
surface. These results are explicitly carried out in the case of a hyperplane 
for which the two equalities take the form of two simultaneous integro- 
differential equations; by the means of an example it is shown that these two 
necessary conditions are not identically satisfied. 

To Professor Hans Lewy under whose guidance this thesis was written, 
I express my appreciation and thanks. 


2. Generalization of Riemann’s method and the explicit formula. Let the
linear differential operator L[u] be defined by the following identity.


\[ L[u] = u_{x_1x_1} + u_{x_2x_2} - u_{x_3x_3} - u_{x_4x_4}. \]


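Then Green’s formula as applied to the expression L[u] may be written as follows.


\[ \iiiint_G \bigl( vL[u] - uL[v] \bigr)\, d\tau = \iiint_O \left( v\,\frac{\partial u}{\partial s} - u\,\frac{\partial v}{\partial s} \right) d\sigma, \tag{2.1} \]


where dτ is the volume element of the domain G and where dσ is the surface
element of the boundary O of G. The operator ∂/∂s is the well-known transversal
derivative³ and is defined by the expression


\[ \frac{\partial}{\partial s} = \frac{\partial x_1}{\partial \nu}\frac{\partial}{\partial x_1} + \frac{\partial x_2}{\partial \nu}\frac{\partial}{\partial x_2} - \frac{\partial x_3}{\partial \nu}\frac{\partial}{\partial x_3} - \frac{\partial x_4}{\partial \nu}\frac{\partial}{\partial x_4}, \tag{2.2} \]


where $\partial x_i/\partial \nu$ are the direction cosines of the exteriorly directed normal to the
surface O.
If L[u] is equated to zero, we obtain that

\[ u_{x_1x_1} + u_{x_2x_2} - u_{x_3x_3} - u_{x_4x_4} = 0, \tag{2.3} \]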


that is, the ultrahyperbolic partial differential equation in four independent 
variables. In order to proceed with the discussion of this equation, we shall 


³ The transversal derivative is also referred to as the conormal derivative.








274 GLYNN OWENS 


make use of the characteristic cone of (2.3). The characteristic cone C of (2.3) 
that has the origin of coordinates P for its vertex is defined as follows. 


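\[ x_1^2 + x_2^2 - x_3^2 - x_4^2 = 0. \tag{2.4} \]

In Green’s formula (2.1), let us replace u by $\partial^2 u/\partial x_i\,\partial x_k$, for abbreviation $u_{x_ix_k}$,
and v by $w_{ik}$; if a summation over 1 ≤ i ≤ k ≤ 4 is now made, we obtain
the result


\[ \iiiint_G \sum_{1\le i\le k\le 4} \bigl\{ w_{ik}\,L[u_{x_ix_k}] - u_{x_ix_k}\,L[w_{ik}] \bigr\}\, d\tau = \iiint_O \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma. \tag{2.5} \]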

The $w_{ik}$ are functions which will be stipulated presently and G is a bounded, simply-connected,
open set that is defined by an initial hypersurface Γ and the characteristic
cone C. We assume of course that the vertex P of C belongs to the
boundary O of G and that it does not lie on Γ. For the sake of explicitness, it is
assumed that G lies on a particular side of the cone C, namely, the side for
which the following inequality holds for all points of G.


\[ x_1^2 + x_2^2 - x_3^2 - x_4^2 > 0. \tag{2.6} \]


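On the characteristic cone C, the result of operating twice with the transversal
derivative, which shall be denoted by $\partial^2/\partial s^2$, is the following.


\[ \frac{\partial^2}{\partial s^2} = \sum_{i=1}^{4} \left( \frac{\partial x_i}{\partial \nu} \right)^{2} \frac{\partial^2}{\partial x_i^2} + 2 \sum_{1\le i<k\le 4} \delta_i \delta_k\, \frac{\partial x_i}{\partial \nu}\frac{\partial x_k}{\partial \nu}\, \frac{\partial^2}{\partial x_i\,\partial x_k}, \]


where $\partial x_i/\partial \nu$ are the direction cosines of the exteriorly directed normal to C and
where δ₁ = δ₂ = 1, δ₃ = δ₄ = −1. The functions $w_{ik}$ are now required to satisfy
the following conditions on C.


\[ w_{ik} = \begin{cases} \left( \dfrac{\partial x_i}{\partial \nu} \right)^{2} & \text{for } i = k, \\[2ex] 2\,\delta_i \delta_k\, \dfrac{\partial x_i}{\partial \nu}\dfrac{\partial x_k}{\partial \nu} & \text{for } i < k \end{cases} \qquad (1 \le i \le k \le 4) \tag{2.7} \]

That is, on the characteristic cone we have that

\[ \frac{\partial^2}{\partial s^2} = \sum_{1\le i\le k\le 4} w_{ik}\, \frac{\partial^2}{\partial x_i\,\partial x_k}. \tag{2.8} \]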


In addition to the conditions (2.7) it is now demanded that the functions $w_{ik}$
be solutions of the ultrahyperbolic equation (2.3). That is,


\[ L[w_{ik}] = 0. \tag{2.9} \]


Accordingly, with the understanding that the solution u of (2.3) is a sufficiently
regular function, the equality (2.5) may now be written as


\[ \iiint_{C+\Gamma} \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma = 0, \tag{2.10} \]


where the boundary O of G has been replaced by C + Γ. In order to carry out
explicitly the integration over the characteristic cone C, we make use of the
following equivalent definition of C.


\[ \begin{aligned} x_1 &= r\cos\theta, & x_3 &= \rho\cos\psi, \\ x_2 &= r\sin\theta, & x_4 &= \rho\sin\psi \end{aligned} \qquad (0 \le r,\ \rho < \infty) \tag{2.11} \]

with $r = \rho = 2^{-1/2}s$. It is to be noted that s is the distance from the origin to
the point on the cone whose coordinates are (x₁, x₂, x₃, x₄). It is also noticed
that for fixed θ and ψ the locus of the equations (2.11), with r = ρ, is a generator
of the cone. Now transversal differentiation on C is easily seen to be equivalent
to differentiation with respect to s. That is, differentiation with respect to
length along a generator of C is equivalent to the operation ∂/∂s. The element
of area on C is

\[ d\sigma = \tfrac12\, s^2\, ds\, d\theta\, d\psi, \]

and since by (2.7) $\partial w_{ik}/\partial s = 0$ on C, the integral over C in (2.10) is


\[ \iiint_C \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} \right\} d\sigma = \frac12 \iiint_C \frac{\partial^3 u}{\partial s^3}\, s^2\, ds\, d\theta\, d\psi, \]


because by (2.8), on C,

\[ \sum_{1\le i\le k\le 4} w_{ik}\, u_{x_ix_k} \]

is precisely the second derivative of u with respect to s, the distance. Let
p = p(θ, ψ) represent the distance from the vertex P of C to the point where
the generator determined by θ and ψ (see (2.11)) meets the surface Γ; then,
by the use of partial integration, the following expression is obtained.


\[ \int_0^p \frac{\partial^3 u}{\partial s^3}\, s^2\, ds = p^2\, \frac{\partial^2 u}{\partial s^2} - 2p\, \frac{\partial u}{\partial s} + 2u - 2u(P), \]


where the arguments (x₁, x₂, x₃, x₄) of the first three terms on the right are
respectively replaced by $2^{-1/2}p\cos\theta$, $2^{-1/2}p\sin\theta$, $2^{-1/2}p\cos\psi$, $2^{-1/2}p\sin\psi$ and where
u(P) means the value of u at P. Making use of (2.10), we find that








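\[ \frac12 \int_0^{2\pi}\!\!\int_0^{2\pi} \left( p^2\, \frac{\partial^2 u}{\partial s^2} - 2p\, \frac{\partial u}{\partial s} + 2u \right) d\theta\, d\psi - 4\pi^2 u(P) + \iiint_\Gamma \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma = 0, \]


or


\[ 8\pi^2 u(P) = \int_0^{2\pi}\!\!\int_0^{2\pi} \left( p^2\, \frac{\partial^2 u}{\partial s^2} - 2p\, \frac{\partial u}{\partial s} + 2u \right) d\theta\, d\psi + 2 \iiint_\Gamma \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma, \tag{2.12} \]


where in the first integral of (2.12) ∂/∂s refers to transversal differentiation on C
and in the second integral ∂/∂s refers to transversal differentiation on the surface Γ.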
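The partial integration used in this section can be written out in two steps. The following is only a sketch of that elementary computation along a single generator, with u regarded as a function of the arc length s and with θ and ψ held fixed; the term 4π² u(P) then arises on integrating over 0 ≤ θ, ψ ≤ 2π with the factor ½ from the element of area.

```latex
\begin{align*}
\int_0^p \frac{\partial^3 u}{\partial s^3}\, s^2\, ds
  &= \Bigl[ s^2\, \frac{\partial^2 u}{\partial s^2} \Bigr]_0^p
     - 2 \int_0^p s\, \frac{\partial^2 u}{\partial s^2}\, ds \\
  &= p^2\, \frac{\partial^2 u}{\partial s^2}\Big|_{s=p}
     - 2 \Bigl[ s\, \frac{\partial u}{\partial s} \Bigr]_0^p
     + 2 \int_0^p \frac{\partial u}{\partial s}\, ds \\
  &= p^2\, \frac{\partial^2 u}{\partial s^2}\Big|_{s=p}
     - 2p\, \frac{\partial u}{\partial s}\Big|_{s=p}
     + 2u\big|_{s=p} - 2u(P), \\[1ex]
\frac12 \int_0^{2\pi}\!\!\int_0^{2\pi} 2u(P)\, d\theta\, d\psi
  &= 4\pi^2 u(P).
\end{align*}
```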


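3. The Riemann functions. In the interior of the domain G of §2 the functions
$w_{ik}$ satisfy the equation $L[w_{ik}] = 0$, that is, the ultrahyperbolic equation,
and on the characteristic cone the following characteristic initial values:


\[ w_{ik} = \begin{cases} \left( \dfrac{\partial x_i}{\partial \nu} \right)^{2} & \text{for } i = k, \\[2ex] 2\,\delta_i \delta_k\, \dfrac{\partial x_i}{\partial \nu}\dfrac{\partial x_k}{\partial \nu} & \text{for } i < k. \end{cases} \qquad (1 \le i \le k \le 4) \tag{3.1} \]

See (2.7) and (2.9). Let us now make the following transformation of coordinates:

\[ \begin{aligned} x_1 &= r\cos\theta, & x_3 &= \rho\cos\psi, \\ x_2 &= r\sin\theta, & x_4 &= \rho\sin\psi \end{aligned} \qquad (0 \le r,\ \rho < \infty). \]


The equation (2.3) is transformed into


\[ u_{rr} + \frac{u_r}{r} + \frac{u_{\theta\theta}}{r^2} - u_{\rho\rho} - \frac{u_\rho}{\rho} - \frac{u_{\psi\psi}}{\rho^2} = 0. \tag{3.2} \]

The boundary conditions (3.1) become:

\[ \begin{aligned} w_{11} &= \tfrac12\cos^2\theta, & w_{12} &= \sin\theta\cos\theta, & w_{13} &= \cos\theta\cos\psi, & w_{14} &= \cos\theta\sin\psi, \\ w_{22} &= \tfrac12\sin^2\theta, & w_{23} &= \sin\theta\cos\psi, & w_{24} &= \sin\theta\sin\psi, \\ w_{33} &= \tfrac12\cos^2\psi, & w_{34} &= \sin\psi\cos\psi, & w_{44} &= \tfrac12\sin^2\psi. \end{aligned} \tag{3.3} \]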













Solutions of (3.2) which assume the values (3.3) and which are regular in the
domain (2.6) are required. The domain (2.6) is equivalent to the domain
r > ρ ≥ 0. The boundary C of (2.6) is equivalent to r = ρ. To find these
functions it suffices to try one or the other of the two following forms for $w_{ik}$.


constant + f(@, Wg () 


Wik 


or 


Wik 


constant + f(6,W)g (2 re r'). 
ia 


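The resulting Riemann functions $w_{ik}$ are given in the following table:


\begin{align*}
w_{11} &= \frac14 + \frac{\cos 2\theta}{4} \left[ 1 + \left( 1 - \frac{\rho^2}{r^2} \right) \log (r^2 - \rho^2) \right]; \\
w_{12} &= \frac{\sin 2\theta}{2} \left[ 1 + \left( 1 - \frac{\rho^2}{r^2} \right) \log (r^2 - \rho^2) \right]; \\
w_{13} &= \frac{\rho}{r} \cos\theta \cos\psi; \\
w_{14} &= \frac{\rho}{r} \cos\theta \sin\psi; \\
w_{22} &= \frac14 - \frac{\cos 2\theta}{4} \left[ 1 + \left( 1 - \frac{\rho^2}{r^2} \right) \log (r^2 - \rho^2) \right]; \\
w_{23} &= \frac{\rho}{r} \sin\theta \cos\psi; \\
w_{24} &= \frac{\rho}{r} \sin\theta \sin\psi; \\
w_{33} &= \frac14 + \frac{\cos 2\psi}{4} \left[ 1 + \left( \frac{r^2}{\rho^2} - 1 \right) \log \left( 1 - \frac{\rho^2}{r^2} \right) \right]; \\
w_{34} &= \frac{\sin 2\psi}{2} \left[ 1 + \left( \frac{r^2}{\rho^2} - 1 \right) \log \left( 1 - \frac{\rho^2}{r^2} \right) \right]; \\
w_{44} &= \frac14 - \frac{\cos 2\psi}{4} \left[ 1 + \left( \frac{r^2}{\rho^2} - 1 \right) \log \left( 1 - \frac{\rho^2}{r^2} \right) \right].
\end{align*}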
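Two of the tabulated functions can be spot-checked numerically. The sketch below is an illustration only: it assumes the forms w₁₃ = (ρ/r) cos θ cos ψ and w₃₃ = ¼ + ¼ cos 2ψ [1 + (r²/ρ² − 1) log(1 − ρ²/r²)], rewrites them in the coordinates x₁, …, x₄, and estimates the left side of (2.3) by central differences at one interior point of the domain r > ρ. The function names and the sample point are hypothetical choices, not part of the paper.

```python
import math

def w13(x1, x2, x3, x4):
    # (rho / r) cos(theta) cos(psi) written in Cartesian coordinates
    return x1 * x3 / (x1 * x1 + x2 * x2)

def w33(x1, x2, x3, x4):
    # 1/4 + (cos 2psi / 4) [1 + (r^2/rho^2 - 1) log(1 - rho^2/r^2)]
    r2 = x1 * x1 + x2 * x2
    rho2 = x3 * x3 + x4 * x4
    cos2psi = (x3 * x3 - x4 * x4) / rho2
    bracket = 1.0 + (r2 / rho2 - 1.0) * math.log(1.0 - rho2 / r2)
    return 0.25 + 0.25 * cos2psi * bracket

def ultrahyperbolic(f, x, h=1e-3):
    """Central-difference estimate of f_11 + f_22 - f_33 - f_44 at x."""
    total = 0.0
    for i, sign in enumerate((1.0, 1.0, -1.0, -1.0)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        total += sign * (f(*up) - 2.0 * f(*x) + f(*dn)) / (h * h)
    return total

# Interior point of the domain r > rho (hypothetical sample).
point = (2.0, 1.0, 0.5, 0.3)
residual_13 = ultrahyperbolic(w13, point)
residual_33 = ultrahyperbolic(w33, point)

# On the cone r = rho, w13 should reduce to cos(theta) cos(psi), as in (3.3).
theta, psi, r = 0.7, 1.1, 1.3
on_cone = (r * math.cos(theta), r * math.sin(theta),
           r * math.cos(psi), r * math.sin(psi))
boundary_gap = abs(w13(*on_cone) - math.cos(theta) * math.cos(psi))
```

Both residuals should be small compared with the individual second derivatives, up to discretization error; w₃₃ is not evaluated on the cone itself because the logarithm is singular there.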


It is to be noticed that the trigonometric terms in $w_{33}$, $w_{34}$, and $w_{44}$ are undefined
for ρ = 0, but $w_{33}$, $w_{34}$, and $w_{44}$ are defined. This follows from the fact that
the common term multiplying the trigonometric factors has the factor ρ. One
observes that the derivatives of certain of the $w_{ik}$ with respect to the variables
$x_i$ are infinite on the cone C. Hence the following remarks are necessary to
justify the validity of the use of Green’s formula in §2. Consider the domain
defined by B, that is,

\[ x_1^2 + x_2^2 - x_3^2 - x_4^2 \ge a \]

and the surface Γ of §2. As a → 0, this approximating domain tends to that
defined by C and Γ. Green’s formula is valid in this domain which lies interior
to our original domain. On the boundary B* of B, the transversal derivatives
of $w_{33}$, $w_{34}$, and $w_{44}$ are zero by virtue of the homogeneity of these functions.
The transversal derivatives on Γ of the six w mentioned above are logarithmically
infinite at the intersection of C and Γ; hence the surface integrals of


\[ \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} \]


over the boundary of the approximating domain that lies on Γ tend to a limit
as a → 0. Although the transversal derivatives of $w_{11}$, $w_{12}$ and $w_{22}$ are not
zero on B*, nevertheless, it is an easy consideration to show that as a → 0


\[ \lim \iiint_{B^*} \sum_{1\le i\le k\le 4} \left\{ u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma = 0. \]


Consequently, the manner in which Green’s formula has been used is justified.


4. Necessary conditions for the initial values. It has been shown by means 
of Asgeirsson’s mean-value theorem that an arbitrary assignment of initial 
values on a non-characteristic hyperplane does not lead to a solution of the 
ultrahyperbolic equation. I propose to investigate further this question of 
suitable initial values. Necessary conditions will be derived for the following 
problem. What initial values are permissible in order to ensure a solution of 


the equation (2.3)? 
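Let us commence by considering Green’s formula (2.5) as used in §2, namely,


\[ \iiiint_G \sum_{1\le i\le k\le 4} \bigl\{ w_{ik}\,L[u_{x_ix_k}] - u_{x_ix_k}\,L[w_{ik}] \bigr\}\, d\tau = \iiint_O \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma, \tag{4.1} \]


where the notation has the same meaning as in §2 except that now the vertex,
the origin of coordinates, P of C lies on the surface Γ. By the results of §2,
it is permissible to write (4.1) as


\[ \iiint_\Gamma \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma + \frac12 \iiint_C \frac{\partial^3 u}{\partial s^3}\, s^2\, ds\, d\theta\, d\psi = 0. \]

By the use of partial integration as in §2, we find that


\[ \frac12 \iiint_C \frac{\partial^3 u}{\partial s^3}\, s^2\, ds\, d\theta\, d\psi = \frac12 \iint \left( p^2\, \frac{\partial^2 u}{\partial s^2} - 2p\, \frac{\partial u}{\partial s} + 2u \right) d\theta\, d\psi - u(P) \iint d\theta\, d\psi, \]


and accordingly our first equality is obtained. Namely,


\[ I(\theta, \psi)\, u(P) = \frac12 \iint \left( p^2\, \frac{\partial^2 u}{\partial s^2} - 2p\, \frac{\partial u}{\partial s} + 2u \right) d\theta\, d\psi + \iiint_\Gamma \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma, \tag{4.2} \]


where $I(\theta, \psi) = \iint d\theta\, d\psi$ and where this double integral depends only on Γ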


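and P. Now to obtain the value of $u_{x_1}(P)$, we need merely to substitute $u_{x_1}$ for u
in equation (4.2), and for convenience let $v = u_{x_1}$. One finds that


\[ I(\theta, \psi)\, u_{x_1}(P) = \frac12 \iint \left( p^2\, \frac{\partial^2 v}{\partial s^2} - 2p\, \frac{\partial v}{\partial s} + 2v \right) d\theta\, d\psi + \iiint_\Gamma \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial v_{x_ix_k}}{\partial s} - v_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma. \tag{4.3} \]


If $\partial x_i/\partial \nu$ denote the direction cosines of the normal to Γ, then the normal
derivative $\partial u/\partial \nu$ on Γ is given by


\[ \frac{\partial u}{\partial \nu} = \sum_{i=1}^{4} \frac{\partial x_i}{\partial \nu}\, \frac{\partial u}{\partial x_i}. \tag{4.4} \]


Equations (4.3) and (4.4) determine an equality which the normal derivative
of u must satisfy on Γ.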

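A definite case is now to be treated. A domain G(R) that is defined by a
hypersphere of radius R, the characteristic cone with vertex at P, and the
hyperplane x₁ = 0 is to be considered. If the transformation of coordinates


\[ \begin{aligned} x_1 &= r\cos\theta, & x_3 &= \rho\cos\psi, \\ x_2 &= r\sin\theta, & x_4 &= \rho\sin\psi \end{aligned} \qquad (0 \le r,\ \rho < \infty) \tag{4.5} \]


is made, then the analytical definition of G(R) is

\[ r^2 + \rho^2 < R^2, \qquad r - \rho > 0, \qquad \cos\theta > 0. \tag{4.6} \]


If the domain G(R) is taken as the region G of (4.1), then it is found that


\[ \iiint_{S+P} \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma + \frac12 \iiint_C \frac{\partial^3 u}{\partial s^3}\, s^2\, ds\, d\theta\, d\psi = 0. \tag{4.7} \]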










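S, C and P mean, respectively, that part of the boundary of G(R) that lies on
the boundary of the sphere, the cone and the plane. The arguments of u in the
last integral as in §2 are (4.5) with $r = \rho = 2^{-1/2}s$. For fixed θ and ψ and with
$r = \rho = 2^{-1/2}s$, (4.5) defines a generator of the characteristic cone passing through
P. This generator meets the surface S at a distance s = R from P. Accordingly,
it is permissible to write


\[ \int_0^R \frac{\partial^3 u}{\partial s^3}\, s^2\, ds = R^2\, \frac{\partial^2 u}{\partial s^2} - 2R\, \frac{\partial u}{\partial s} + 2u(R) - 2u(P). \]


Since cos θ > 0, −π/2 < θ < π/2 and (4.7) becomes


\[ \frac12 \int_{-\pi/2}^{\pi/2} d\theta \int_0^{2\pi} \left( R^2\, \frac{\partial^2 u}{\partial s^2} - 2R\, \frac{\partial u}{\partial s} + 2u \right) d\psi - 2\pi^2 u(P) + \iiint_{S+P} \sum_{1\le i\le k\le 4} \left\{ w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial s} - u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial s} \right\} d\sigma = 0. \tag{4.8} \]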


Assume now that u and its derivatives vanish sufficiently rapidly as R tends to
infinity. This assures the vanishing of all integrals of (4.8) that are taken on S.
Hence,


\[ 2\pi^2 u(P) = \iiint_A \sum_{1\le i\le k\le 4} \left\{ u_{x_ix_k}\,\frac{\partial w_{ik}}{\partial x_1} - w_{ik}\,\frac{\partial u_{x_ix_k}}{\partial x_1} \right\} dx_2\, dx_3\, dx_4, \tag{4.9} \]


where A is the 3-dimensional domain $x_2^2 - \rho^2 > 0$. Placing $v = u_{x_1}$, we find
that


\[ 2\pi^2 u_{x_1}(P) = \iiint_A \sum_{1\le i\le k\le 4} \left\{ v_{x_ix_k}\,\frac{\partial w_{ik}}{\partial x_1} - w_{ik}\,\frac{\partial v_{x_ix_k}}{\partial x_1} \right\} dx_2\, dx_3\, dx_4. \tag{4.10} \]


Therefore the initial values satisfy a pair of simultaneous integro-differential
equations.

If (4.9) and (4.10) were identically satisfied by all functions u and v which
are sufficiently regular and which vanish properly at infinity, then these equations
would not be restrictions on the initial values. It suffices to show that
(4.9) is not satisfied for arbitrary v, subject to the above conditions. Consider
that part, M(v), of the integrand of (4.9) that does not contain u or its derivatives.⁵
Namely,


\[ \begin{aligned} M(v) = {} & w_{11}\,(v_{x_2x_2} - v_{x_3x_3} - v_{x_4x_4}) + v_{x_2}\,\frac{\partial w_{12}}{\partial x_1} + v_{x_3}\,\frac{\partial w_{13}}{\partial x_1} + v_{x_4}\,\frac{\partial w_{14}}{\partial x_1} \\ & - w_{22}\,v_{x_2x_2} - w_{23}\,v_{x_2x_3} - w_{24}\,v_{x_2x_4} - w_{33}\,v_{x_3x_3} - w_{34}\,v_{x_3x_4} - w_{44}\,v_{x_4x_4}. \end{aligned} \tag{4.11} \]

⁵ u and v are initial values prescribed on the plane x₁ = 0 and the operator ∂/∂s = −∂/∂x₁
on x₁ = 0.









Since (4.9) is now assumed to hold for arbitrary v, the integral I(v) of M(v) 
over the interior of the cone A must vanish for arbitrary v. v will now be 
defined so as to provide a contradiction. Let 


((? — 2)* forza? <Pandr+ psp’ ) 
0 for 2; > andr’+p’ < 
(4.12) v= P . 


| Continue v outside of the circle r + p S p in a sufficiently 
regular manner so that v vanishes outside of a circle r + p = | 
.6, where 5 > p. 


where a is a sufficiently large number. It is now to be shown that J(v) does not 
vanish. Making use of (4.12) one finds that 


. Ow} 
(4.13) I(v) = IT | (oe —_ W11)Uzo29 - = ta | dreds dre, 
A 


and that for x, = 0, 


1 : , 
w2 — On = ab a (1 - 4 log (x2 — P|, 


dure _ 1 1-8) 7? a 
—* 1h + (1 7 log (x2 — p’) }. 


Substituting the quantities in (4.14) into (4.13), I(v) becomes 


(4.15) I(v) = If [1 + (1 - a) log (23 — p) | toa - - tay | doa dad. 


By the use of the definition of v, it is found that 


1 


(4.14) 


Bes, — — Us = a(l” — 29)" + Qala — 1)a( — 23)* 
and hence 
I(v) = 2 IT [1 + (1 - 4 log (22 — P| 
Zg—p>0 7 


‘[a(l? — 23)** + 2ala — 1)a3(l? — x2)* "|p dpdy dz, 


observing that the transformation x₃ = ρ cos ψ, x₄ = ρ sin ψ has been made.
To show that I(v) does not vanish it suffices to show that the next integral J
is zero at x₂ = 0 and is monotone.


\[ J = \int_0^{x_2} \left[ \rho + \rho \left( 1 - \frac{\rho^2}{x_2^2} \right) \log (x_2^2 - \rho^2) \right] d\rho. \]


Evaluating J we find that


\[ J = \tfrac38\, x_2^2 + \tfrac12\, x_2^2 \log x_2. \tag{4.16} \]


From (4.16) it is obvious that J has the desired properties provided that l
(see (4.12)) has been chosen sufficiently small. Consequently, the integral I(v)
does not vanish and we have a contradiction.
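The evaluation of J can be checked numerically. The sketch below assumes that the integrand of J is ρ + ρ(1 − ρ²/x₂²) log(x₂² − ρ²) on 0 ≤ ρ ≤ x₂ and that the closed form is ⅜x₂² + ½x₂² log x₂; it compares the two with a midpoint rule, which avoids the endpoint ρ = x₂ where the logarithm is singular. The function names and the sample values of x₂ are hypothetical.

```python
import math

def integrand(rho, x2):
    # rho + rho (1 - rho^2/x2^2) log(x2^2 - rho^2)
    return rho + rho * (1.0 - rho**2 / x2**2) * math.log(x2**2 - rho**2)

def J_numeric(x2, n=20000):
    """Midpoint-rule approximation of the integral over 0 <= rho <= x2."""
    h = x2 / n
    return h * sum(integrand((k + 0.5) * h, x2) for k in range(n))

def J_closed(x2):
    """Assumed closed form (3/8) x2^2 + (1/2) x2^2 log x2."""
    return 0.375 * x2**2 + 0.5 * x2**2 * math.log(x2)

gap = abs(J_numeric(0.5) - J_closed(0.5))
small_values = [J_closed(x) for x in (0.05, 0.1, 0.2)]
```

Under these assumptions the derivative of the closed form is x₂(5/4 + log x₂), which is negative for small positive x₂, so the closed form tends to zero with x₂ and is monotone on a sufficiently short interval.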

In conclusion, I would like to say that the methods of this paper have been 
successfully applied to the ultrahyperbolic equation in five independent 
variables. 

BIBLIOGRAPHY 

1. L. Asgeirsson, Über eine Mittelwertseigenschaft von Lösungen homogener linearer partieller
Differentialgleichungen 2. Ordnung mit konstanten Koeffizienten, Mathematische
Annalen, vol. 113(1936), pp. 321-346.

2. G. Hamel, Über die Geometrieen, in denen die Geraden die Kürzesten sind, Mathematische
Annalen, vol. 57(1903), pp. 231-264.

3. F. John, The ultrahyperbolic differential equation with four independent variables, Duke
Mathematical Journal, vol. 4(1938), pp. 300-322.

4. H. Lewy, Verallgemeinerung der Riemannschen Methode auf mehr Dimensionen, Nachrichten
von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische
Klasse, 1928, pp. 118-123.

UNIVERSITY OF CALIFORNIA, AT BERKELEY. 













GENERALIZED ARITHMETIC 
By GARRETT BIRKHOFF 


1. Introduction. Since the time of Cantor, it has been the fashion to divide all
arithmetic at the root into two separate branches: cardinal and ordinal. Each
of these branches is supposed to have its peculiar operations of addition, multiplication,
and exponentiation. Only as an afterthought are the two branches
connected, by a roughly¹ homomorphic correspondence from ordinal arithmetic
to cardinal arithmetic, which is isomorphic when restricted to finite ordinals and
cardinals.

In the present paper, an entirely different point of view is advanced. Instead 
of giving finite and transfinite arithmetic a split personality, half ordinal, half 
cardinal, I believe that one should regard both aspects as fragments of a unified 
general arithmetic of partially ordered systems. 

I should like to stress three arguments in favor of this point of view. 

In the first place, what are usually considered as purely cardinal operations
extend in a natural way to ordinal numbers and other partially ordered systems,
and vice versa. Moreover, when applied to the wider context of general
partially ordered systems, the six operations of “generalized arithmetic” are
found to have important new applications.² The variety and importance of
these will stand comparison with the applications of transfinite arithmetic, as
that term has been understood heretofore.³

In the second place, almost all arithmetical laws which are valid in transfinite
arithmetic, as that term is understood now, are equally valid when the operations 
are applied to the most general partially ordered systems. In fact, the big gap 
comes between ordinary arithmetic and transfinite arithmetic; much more is 
lost by admitting infinite numbers as legitimate objects for arithmetic operations 
than is lost by including partially ordered sets in the middle ground between 
totally ordered sets (finite ordinals) and totally unordered sets (finite cardinals). 
Moreover, the slight loss is more than compensated by the availability of new 
cross-laws connecting cardinal with ordinal operations. 

Finally, adoption of the broader point of view towards arithmetic developed
below fits the traditional transfinite arithmetic into the general framework of


Received October 18, 1941. 

¹ Ordinal exponentiation does not quite fit into this statement, and involves special
complications. Also, the proof of this connection involves the well-ordering principle
(axiom of choice).

² Much may be found about the extension of cardinal operations and applications of the
extended definitions in [1]. However, the scope of the present program is nowhere suggested
in that paper.

³ The need for the operations of transfinite arithmetic was never very great; the need
in topology for even transfinite ordinals has now largely disappeared, thanks to the increased
use of the more effective and simpler tool of Moore-Smith convergence.










284 GARRETT BIRKHOFF 


modern algebra. This gives fresh insight into known facts, and suggests new 
problems and results. 


2. Numbers, subnumbers, and homonumbers. Let us agree to mean by the
word number any non-void partially ordered set. That is ([3], Chap. VI, §2,
or [2], p. 5), a “number” is a set A of elements x, y, z, ···, connected by a reflexive,
transitive, and anti-symmetric⁴ relation x ≥ y. Numbers will be
denoted by italic capital letters throughout the sequel.

We shall call two numbers A and B equal (in symbols, A = B) if and only if 
they represent isomorphic partially ordered sets. This relation has the usual 
reflexive, symmetric, and transitive properties; moreover, it conforms to accepted 
usage. 

The usual meaning of inequality between cardinal numbers and also that 
between ordinal numbers appear as special cases of the concept of subnumber 
as now defined. 

DEFINITION. A number A will be called a subnumber of a number B (in
symbols, A ⊂ B) if and only if A is isomorphic to a subset of B.

We shall state without proof the evident

THEOREM 1. The relation of being a subnumber is consistent, reflexive and transitive;
unity is a subnumber of every number. Formally,


(1) if B = C, then A ⊂ B implies A ⊂ C and B ⊂ D implies C ⊂ D;
(2) for all A, A ⊂ A;

(3) if A ⊂ B and B ⊂ C, then A ⊂ C;

(4) for all A, 1 ⊂ A.


The concept of subnumber may be supplemented by the closely related con- 
cept of homonumber, suggested by the general ideas of modern algebra. This 
concept, in the present context, is new. 

DEFINITION. A number A will be called a homonumber of a number B (in
symbols, A < B), if and only if there is a many-one or one-one correspondence
of B onto A which preserves order, so that if a and a′ are the images of b and b′,
respectively, then b ≥ b′ in B implies a ≥ a′ in A.

THEOREM 2. The relation of being a homonumber is consistent, reflexive and
transitive; unity is a homonumber of every number. Formally,


(5) if B = C, then A < B implies A < C and B < D implies C < D;


(6) for all A, A < A; 


⁴ It has been pointed out to the author by J. W. Tukey that most of the results below are
independent of this restriction, and indeed the definition of ordinal exponentiation is
simplified. However, this further generalization will not be made here since it might
be confusing to many.








GENERALIZED ARITHMETIC 285 


(7) if A < B and B < C, then A < C;
(8) for all A, 1 < A. 


We note that the ordinal ω is not a homonumber of ω + 1, although it is a
subnumber of it. For, any homonumber of ω + 1 would have a last element I,
whereas ω has none.

We now come face-to-face with the principal properties of cardinal and ordinal
numbers, which are lost in the generalized arithmetic developed here. First,
the relation ⊂, which is anti-symmetric for cardinals (Bernstein’s theorem)
and ordinals, is not anti-symmetric⁵ in general. Second, it is obviously not
true that for all A, B either A ⊂ B or B ⊂ A, although this is well known to be
true for ordinals ([2], Theorem 1.8), and hence (using the well-ordering principle)
for cardinals. In summary, the comparability property is lost.⁶
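For finite numbers the two relations can be tested by exhaustive search. The following is a minimal sketch, not an efficient algorithm: a number is represented (as an assumption of this illustration) by a pair consisting of a list of elements and the set of pairs (x, y) with x ≤ y, and the function names are hypothetical. It shows in particular that the two-element chain and the two-element unordered set are not comparable under ⊂, while the chain is a homonumber of the unordered set.

```python
from itertools import permutations, product

def chain(n):
    """Chain 0 < 1 < ... < n-1 as (elements, set of pairs (x, y) with x <= y)."""
    elems = list(range(n))
    return elems, {(x, y) for x in elems for y in elems if x <= y}

def antichain(n):
    """n pairwise incomparable elements."""
    elems = list(range(n))
    return elems, {(x, x) for x in elems}

def is_subnumber(A, B):
    """True if A is isomorphic to a subset of B with the induced order."""
    (ea, la), (eb, lb) = A, B
    for image in permutations(eb, len(ea)):      # injective maps A -> B
        f = dict(zip(ea, image))
        if all(((x, y) in la) == ((f[x], f[y]) in lb) for x in ea for y in ea):
            return True
    return False

def is_homonumber(A, B):
    """True if some order-preserving correspondence maps B onto A."""
    (ea, la), (eb, lb) = A, B
    for image in product(ea, repeat=len(eb)):    # all maps B -> A
        f = dict(zip(eb, image))
        if set(image) == set(ea) and all((f[x], f[y]) in la for (x, y) in lb):
            return True
    return False

two_chain, two_unordered = chain(2), antichain(2)
comparable = (is_subnumber(two_chain, two_unordered)
              or is_subnumber(two_unordered, two_chain))
chain_is_homo = is_homonumber(two_chain, two_unordered)
unordered_is_homo = is_homonumber(two_unordered, two_chain)
```

The search is exponential in the number of elements, so it is only meant for diagrams of a handful of points.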


3. Cardinal and ordinal addition. The usual definitions of addition for
cardinal and ordinal numbers generalize in obvious ways to arbitrary partially
ordered sets.⁷

DEFINITION. By the cardinal sum of A and B (in symbols, A + B) is meant
the number consisting of all the elements in A or B, where inclusion within A
and within B keep their original meaning, while neither a ≥ b nor a ≤ b holds
for any a ∈ A, b ∈ B. By the ordinal sum A ⊕ B of A and B is meant the number
consisting of all elements in A and all those in B, where inclusion within A and
within B keep their original meaning, while a > b holds for all a ∈ A and b ∈ B.

Thus for finite numbers, we can construct the diagrams [2] of A + B and of
A ⊕ B as follows. That of A + B is obtained by laying the diagrams of A
and B side-by-side, that of A ⊕ B by laying that of A above that of B and
drawing lines from all minimal elements of A to all maximal elements of B.
Thus the graph of a cardinal sum is always disconnected; while if A has a least
element O and B a greatest element I, then the graph of A ⊕ B has a “node”,
or line which, if severed, would disconnect the graph. The converses of these
also hold.
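The construction of the two diagrams can be imitated for small finite numbers. In this sketch a number is again represented, purely for illustration, by a list of elements together with the set of pairs (x, y) with x ≤ y; the tags "A" and "B" and all function names are hypothetical. The covering pairs play the role of the lines of the diagram.

```python
def chain(n):
    """Chain 0 < 1 < ... < n-1 as (elements, set of pairs (x, y) with x <= y)."""
    elems = list(range(n))
    return elems, {(x, y) for x in elems for y in elems if x <= y}

def cardinal_sum(A, B):
    """A + B: all elements of A and B, with no relation between A and B."""
    (ea, la), (eb, lb) = A, B
    elems = [("A", x) for x in ea] + [("B", y) for y in eb]
    leq = {(("A", x), ("A", y)) for (x, y) in la}
    leq |= {(("B", x), ("B", y)) for (x, y) in lb}
    return elems, leq

def ordinal_sum(A, B):
    """A (+) B: as the cardinal sum, with a > b for all a in A and b in B."""
    (ea, _), (eb, _) = A, B
    elems, leq = cardinal_sum(A, B)
    return elems, leq | {(("B", y), ("A", x)) for x in ea for y in eb}

def covers(P):
    """Pairs (lower, upper) such that upper covers lower: the lines of the diagram."""
    elems, leq = P
    lt = {(x, y) for (x, y) in leq if x != y}
    return {(x, y) for (x, y) in lt
            if not any((x, z) in lt and (z, y) in lt for z in elems)}

two_chain = chain(2)
side_by_side = cardinal_sum(two_chain, two_chain)
stacked = ordinal_sum(two_chain, two_chain)
lines_added = covers(stacked) - covers(side_by_side)
```

Here `lines_added` contains the single line joining the maximal element of B to the minimal element of A, which is the "node" of the stacked diagram.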


THEOREM 3. Cardinal and ordinal addition are both single-valued, isotone, and
associative. Cardinal addition is commutative, and homomorphic to ordinal
addition. Formally,


(9) if A = B and C = D, then A + C = B + D and A ⊕ C = B ⊕ D;
(10) if A ⊂ B and C ⊂ D, then A + C ⊂ B + D and A ⊕ C ⊂ B ⊕ D;
(11) if A < B and C < D, then A + C < B + D and A ⊕ C < B ⊕ D;


⁵ Let A consist of all fractions m + 1/(n + 1), where m and n are positive integers; let
B consist of all those with m > 1 or n = 1. Then A ⊂ B and B ⊂ A, yet A ≠ B.

⁶ Slightly more is lost: the fact that the cardinals and ordinals form well-ordered sets
under the relation A ⊂ B.

⁷ For the usual definitions, cf. [3] or [4]. The generalizations were outlined in [1]; they
are obvious enough.








(12) A + (B + C) = (A + B) + C and A ⊕ (B ⊕ C) = (A ⊕ B) ⊕ C;
(13) A + B = B + A;
(14) A ⊕ B < A + B.


Proof. The proofs of (9)-(11) follow by just putting together the correspondences
between the summands. Formulas (12)-(13) are obvious; the correspondence
giving (14) is just the identical correspondence from A to A and B to B.

Remark. It is known that ω = 1 ⊕ ω ≠ ω ⊕ 1: ordinal addition of infinite
ordinals is not commutative. It is also true that ordinal addition of finite
cardinals is non-commutative.
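The last statement of the Remark can be illustrated with the cardinals 1 and 2, taken as totally unordered sets. The sketch below uses the same illustrative representation as before (a list of elements and the set of pairs (x, y) with x ≤ y, with hypothetical function names) and tests isomorphism by trying every one-one correspondence, which is feasible only for very small numbers.

```python
from itertools import permutations

def antichain(n):
    """The finite cardinal n: n pairwise incomparable elements."""
    elems = list(range(n))
    return elems, {(x, x) for x in elems}

def cardinal_sum(A, B):
    (ea, la), (eb, lb) = A, B
    elems = [("A", x) for x in ea] + [("B", y) for y in eb]
    leq = {(("A", x), ("A", y)) for (x, y) in la}
    leq |= {(("B", x), ("B", y)) for (x, y) in lb}
    return elems, leq

def ordinal_sum(A, B):
    """Every element of A is placed above every element of B."""
    (ea, _), (eb, _) = A, B
    elems, leq = cardinal_sum(A, B)
    return elems, leq | {(("B", y), ("A", x)) for x in ea for y in eb}

def is_isomorphic(P, Q):
    """True if some one-one correspondence carries the order of P onto that of Q."""
    (ep, lp), (eq, lq) = P, Q
    if len(ep) != len(eq) or len(lp) != len(lq):
        return False
    for image in permutations(eq):
        f = dict(zip(ep, image))
        # With equally many relations, mapping them into lq maps them onto lq.
        if all((f[x], f[y]) in lq for (x, y) in lp):
            return True
    return False

one, two = antichain(1), antichain(2)
ordinal_commutes = is_isomorphic(ordinal_sum(one, two), ordinal_sum(two, one))
cardinal_commutes = is_isomorphic(cardinal_sum(one, two), cardinal_sum(two, one))
```

In 1 ⊕ 2 a single element lies above two incomparable ones, whereas in 2 ⊕ 1 two incomparable elements lie above a single one, so the search finds no isomorphism.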

Finally, we can state some results connecting inclusion with addition, which 
are true except for the combination of homomorphic inclusion with ordinal 
addition. We have 


(15) if A ⊂ B + C, then A = D + E, where D ⊂ B and E ⊂ C;
(16) if A ⊂ B ⊕ C, then A = D ⊕ E, where D ⊂ B and E ⊂ C;
(17) if B + C < A, then A = D + E, where B < D and C < E;
(18) A ⊂ A + B, B ⊂ A + B, A ⊂ A ⊕ B, and B ⊂ A ⊕ B;
(19) A < A + B, and B < A + B.


Of these results, (15)-(17) assert the hereditary nature of decomposability, and
(15)-(16) are converses of (10)-(11). While (18) would be trivial if one admitted
the void set 0 as a number,⁸ it would then be a corollary of (10).


4. Digression: unique decomposition theorems. It may be shown that 
decomposition into either cardinal or ordinal summands is essentially unique. 


THEOREM 4. Any two decompositions of a number A into cardinal summands 
or ordinal summands have a common refinement. 


COROLLARY. A number has at most one cardinal or ordinal decomposition into
indecomposable summands; in the finite case, this always exists.


Proof. It is easy to show that a partition of a “number” A represents a
decomposition of A into cardinal summands if and only if x ≥ y implies that
x and y are in the same subdivision of A. But it is easy to show that the product
(in the ordinary sense) of any two partitions with this property itself has this
property, and is the desired refinement. Similarly, a partition of A represents
a decomposition of A into ordinal summands if and only if, given any two
pieces A_i and A_j of the partition, either x > y for all x ∈ A_i, y ∈ A_j, or the
reverse holds. With a little trouble, one can also show that the product of any
two such partitions is also such a partition.


⁸ There are two troubles with this. First, 0⁰ and ⁰0 would then be ambiguously defined
(they could be either 0 or 1), and second, (8) would no longer be true.










We note that in the case of ordinal sums, even the order of summation is 
prescribed. With cardinal sums, the representation as a sum of indecomposable 
summands is only unique to within rearrangement of the factors. 


5. Cardinal multiplication. The usual notion of the product of two cardinal
numbers also has an obvious and known (cf. [1] or [2]) generalization to arbitrary
partially ordered sets or, in our terminology, to arbitrary “numbers”.

DEFINITION. By the cardinal product of two given numbers A and B (written
AB) is meant the set of all couples (a, b) (a ∈ A, b ∈ B), where (a, b) ≥ (a′, b′)
if and only if a ≥ a′ in A and b ≥ b′ in B.


THEOREM 5. Cardinal multiplication is single-valued, isotone, commutative, 
and associative; it admits 1 as an identity, is distributive on cardinal sums, and 
semi-distributive on ordinal sums. Formally, 


(20) A = B implies AC = BC and CA = CB;

(21) A ⊂ B implies AC ⊂ BC and CA ⊂ CB;

(22) A < B implies AC < BC and CA < CB;

(23) AB = BA for all A, B;

(24) A(BC) = (AB)C for all A, B, C;

(25) 1A = A1 = A for all A;

(26) A(B + C) = AB + AC and (A + B)C = AC + BC;
(27) A(B ⊕ C) > AB ⊕ AC and (A ⊕ B)C > AC ⊕ BC.


The verification of each of these laws individually, except (27), is a straightforward
and elementary exercise which involves only substituting in definitions
and using obvious correspondences (cf. [1] for (23), (24), (26)). To prove (27),
consider the obvious one-one correspondence between $A(B \oplus C)$ and
$AB \oplus AC$. Each element of either can be identified with a couple $(a, b)$ or
$(a, c)$ $(a \in A, b \in B, c \in C)$. In both, $(a, b) \ge (a', b')$ if and only if $a \ge a'$ and
$b \ge b'$, and similarly for $(a, c) \ge (a', c')$. In $A(B \oplus C)$, we have $(a, b) \ge (a', c)$
if and only if $a \ge a'$; in $AB \oplus AC$, we have $(a, b) \ge (a', c)$ identically; these
give (27).
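The comparison made in this proof can be checked mechanically on a small case. The following is only a sketch under stated assumptions: a finite number is stored as a pair (elements, set of pairs (x, y) with x >= y), the helper names `chain`, `cardinal_product`, `ordinal_sum` and `phi` are hypothetical, and the ordinal sum is assumed to place every element of the first summand above every element of the second. With A the two-element chain and B = C = 1, the obvious correspondence is order-preserving from $A(B \oplus C)$ to $AB \oplus AC$, but the two sides have different numbers of relations, so it is not an isomorphism.

```python
from itertools import product

def chain(n):
    """n-element chain 0 < 1 < ... < n-1 as (elements, pairs (x, y) with x >= y)."""
    E = list(range(n))
    return E, {(x, y) for x in E for y in E if x >= y}

def cardinal_product(P, Q):
    """(a, b) >= (a', b') iff a >= a' in P and b >= b' in Q."""
    (EP, GP), (EQ, GQ) = P, Q
    E = list(product(EP, EQ))
    return E, {(u, v) for u in E for v in E
               if (u[0], v[0]) in GP and (u[1], v[1]) in GQ}

def ordinal_sum(P, Q):
    """Tagged union with every element of P above every element of Q (assumed convention)."""
    (EP, GP), (EQ, GQ) = P, Q
    E = [(0, p) for p in EP] + [(1, q) for q in EQ]
    G = {((0, x), (0, y)) for (x, y) in GP}
    G |= {((1, x), (1, y)) for (x, y) in GQ}
    G |= {((0, p), (1, q)) for p in EP for q in EQ}
    return E, G

A, B, C = chain(2), chain(1), chain(1)
lhs = cardinal_product(A, ordinal_sum(B, C))                        # A(B (+) C)
rhs = ordinal_sum(cardinal_product(A, B), cardinal_product(A, C))   # AB (+) AC

def phi(u):
    """The obvious correspondence: (a, (tag, x)) -> (tag, (a, x))."""
    a, (tag, x) = u
    return (tag, (a, x))

# Every relation of the left side is carried to a relation of the right side.
isotone = all((phi(u), phi(v)) in rhs[1] for (u, v) in lhs[1])
print(isotone, len(lhs[1]), len(rhs[1]))
```

Here the left side is the four-element square and the right side the four-element chain, which is why the relation counts differ.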

To show that equality does not hold in (27), note that, in terms of lattice
diagrams, we have the situation shown in Fig. 1.

[Fig. 1. Lattice diagrams of the two sides of (27).]

As a corollary of (21), (22), and (25), we get also










(28) $A = A1 \subset AB$ and $A = A1 < AB$, for all B.


This is analogous to (18)-(19) above. 


6. Ordinal or lexicographic multiplication. The usual definition ([3], p. 78) 
of the lexicographic product of two ordered sets applies without change to 
arbitrary “numbers”. When so applied, it specializes not only to the usual 
product of two ordinals, but also (curiously enough) to the product of two 
cardinals, as usually understood. Nevertheless, we shall not regard it as the 
correct generalization of ordinary cardinal multiplication, for the reason that it 
does not, in general, satisfy the identities of cardinal arithmetic. 

DEFINITION. By the ordinal product of two numbers A and B (in symbols,
$A \circ B$) is meant the set of all couples $(a, b)$ $(a \in A, b \in B)$, where $(a, b) \ge (a', b')$
if and only if $a > a'$ in A, or $a = a'$ in A and $b \ge b'$ in B.

In the finite case, we can construct the diagram of $A \circ B$ from the diagrams of
A and B as follows. In each circle representing an element a of A, put a replica
$B_a$ of B. Then draw segments from all the maximal elements of each $B_a$ to
all the minimal elements of each $B_{a'}$, for a' covering⁹ a. This rule may be justified
by the covering condition of


LEMMA 1. In $A \circ B$, $(a, b)$ covers $(a', b')$ if and only if (i) $a = a'$ and b covers b',
or (ii) b is minimal and b' maximal in B, while a covers a' in A.
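For finite numbers the covering rule of Lemma 1 can be compared with a direct computation. The sketch below is illustrative only: the representation (elements, set of pairs (x, y) with x >= y) and the helper names `poset`, `covers`, `ordinal_product` and `lemma1_covers` are assumptions made for this example, and the two small numbers A and B are arbitrary choices.

```python
def poset(elems, strict):
    """Poset as (elements, pairs (x, y) with x >= y).

    `strict` must already be a transitive set of pairs (x, y) with x > y.
    """
    elems = list(elems)
    return elems, set(strict) | {(x, x) for x in elems}

def covers(P):
    """All pairs (x, y) such that x covers y in P."""
    E, G = P
    gt = {(x, y) for (x, y) in G if x != y}
    return {(x, y) for (x, y) in gt
            if not any((x, z) in gt and (z, y) in gt for z in E)}

def ordinal_product(P, Q):
    """(a, b) >= (a', b') iff a > a', or a = a' and b >= b'."""
    (EP, GP), (EQ, GQ) = P, Q
    E = [(a, b) for a in EP for b in EQ]
    G = {(u, v) for u in E for v in E
         if (u[0] != v[0] and (u[0], v[0]) in GP)
         or (u[0] == v[0] and (u[1], v[1]) in GQ)}
    return E, G

def lemma1_covers(P, Q):
    """Covering pairs of the ordinal product as predicted by Lemma 1."""
    (EP, GP), (EQ, GQ) = P, Q
    maximal = [b for b in EQ if all(x == b or (x, b) not in GQ for x in EQ)]
    minimal = [b for b in EQ if all(x == b or (b, x) not in GQ for x in EQ)]
    case_i = {((a, b), (a, b2)) for a in EP for (b, b2) in covers(Q)}
    case_ii = {((a, b), (a2, b2)) for (a, a2) in covers(P)
               for b in minimal for b2 in maximal}
    return case_i | case_ii

A = poset("xyz", {("y", "x"), ("z", "x")})   # y and z both above x
B = poset([0, 1, 2], {(2, 0), (2, 1)})       # 2 above 0 and 1
computed = covers(ordinal_product(A, B))
predicted = lemma1_covers(A, B)
print(computed == predicted, len(computed))
```

In this example six covering pairs come from case (i) and four from case (ii).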


THEOREM 6. Ordinal multiplication is single-valued, isotone, and associative; 
it admits 1 as an identity, and is semi-distributive on ordinal and cardinal sums 
alike. Formally, 


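(29) $A = B$ implies $A \circ C = B \circ C$ and $C \circ A = C \circ B$;

(30) $A \subset B$ implies $A \circ C \subset B \circ C$ and $C \circ A \subset C \circ B$;

(31) $A < B$ implies $C \circ A < C \circ B$;

(32) $A \circ (B \circ C) = (A \circ B) \circ C$ for all A, B, C;

(33) $1 \circ A = A \circ 1 = A$ for all A;

(34) $(A \oplus B) \circ C = A \circ C \oplus B \circ C$;

(35) $(A + B) \circ C = A \circ C + B \circ C$ and $A \circ (B + C) < A \circ B + A \circ C$.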


Proof. Rules (29)-(31) are immediate if $(a, c)$ is made to correspond to
$(b, c)$ if and only if a corresponds to b; the details are easy to check. The
proofs of (32)-(33) are also immediate.

In regard to (34)-(35), we use the cardinal one-one correspondence already
utilized in proving (26)-(27). First, we have the two equalities. In all cases,
the couples $(a, c)$ are ordered as in $A \circ C$ and the couples $(b, c)$ as in $B \circ C$; hence
we need merely compare couples $(a, c)$ with couples $(b, c')$. In (34), it is easy
to show that always, on both sides, $(a, c) \ge (b, c')$, while in (35), $(a, c)$ and
$(b, c')$ are unrelated in both cases. Then, we have the inequality. In both
cases, the couples $(a, b)$ are ordered as in $A \circ B$ and the couples $(a, c)$ as in $A \circ C$;
hence we need merely compare couples $(a, b)$ with couples $(a', c)$. But these are never
comparable in $A \circ B + A \circ C$; hence the correspondence certainly preserves all
inequalities.

9 By the statement a' covers a, it is meant that $a' > a$, yet $a' > x > a$ for no x.

Remarks. Ordinal multiplication is thus more closely connected with cardinal 
than with ordinal addition. 

The non-commutativity of ordinal multiplication, known in the infinite case, 
clearly appears also in the finite case. 

One might expect $A \circ (B \oplus C)$ and $A \circ B \oplus A \circ C$ to be related in some
way. However, if A, B, and C are finite and A is simply ordered, one may
show by a little numerical computation that both expressions have the same
number of elements and of relations; hence, if related by any inequality, they
must be isomorphic. Now set A = the ordinal two (or two-element chain),
B = 3, and C = 2. Drawing the appropriate diagrams, we get
$A \circ (B \oplus C) \ne A \circ B \oplus A \circ C$.

[Fig. 2]


Surprisingly enough, $A < B$ need not imply $A \circ C < B \circ C$; cf. Fig. 2. The
same example shows that $A \not< B \circ A$ is possible (for A = 2, B the two-element
chain as above). On the other hand, it is an immediate corollary of (34)-(35)
and the fact that 1 is a subnumber and homonumber of every number, as well
as an identity for ordinal multiplication, that

(36) $A \subset A \circ B$, $B \subset A \circ B$, $A < A \circ B$, for all A, B.


7. Cardinal exponentiation. The usual definition (cf. [3], p. 37) of a cardinal
power of a cardinal number is a special case of a general definition, applicable
to arbitrary numbers.¹⁰ This general definition is not suggested by any very
obvious considerations, but plays an important rôle in lattice theory.

DEFINITION. By the cardinal power $A^B$ of the "base" number A raised to the
"exponent" B is meant the set of all those functions f(x) with domain B and
range in A, which are isotone in the sense that $x \ge y$ in B implies $f(x) \ge f(y)$ in A.
This set is ordered by making $f \ge g$ in $A^B$ mean that $f(x) \ge g(x)$ in A for all
$x \in B$.
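For finite numbers this definition can be carried out by brute force. The following sketch is an illustration under assumptions, not a construction taken from the paper: numbers are stored as (elements, set of pairs (x, y) with x >= y), a function is stored as the tuple of its values, and the names `chain`, `antichain` and `cardinal_power` are hypothetical.

```python
from itertools import product

def chain(n):
    """n-element chain 0 < 1 < ... < n-1."""
    E = list(range(n))
    return E, {(x, y) for x in E for y in E if x >= y}

def antichain(n):
    """n-element totally unordered set (a cardinal number)."""
    E = list(range(n))
    return E, {(x, x) for x in E}

def cardinal_power(A, B):
    """Isotone functions from B into A, ordered pointwise."""
    (EA, GA), (EB, GB) = A, B
    funcs = []
    for values in product(EA, repeat=len(EB)):
        f = dict(zip(EB, values))
        # isotone: x >= y in B implies f(x) >= f(y) in A
        if all((f[x], f[y]) in GA for (x, y) in GB):
            funcs.append(values)
    ge = {(f, g) for f in funcs for g in funcs
          if all((fx, gx) in GA for fx, gx in zip(f, g))}
    return funcs, ge

P = cardinal_power(chain(2), chain(2))       # exponent a two-element chain
Q = cardinal_power(chain(2), antichain(2))   # exponent the cardinal two
print(len(P[0]), len(P[1]), len(Q[0]), len(Q[1]))
```

With the two-element chain as base, a chain exponent yields the three isotone functions forming a three-element chain, while the cardinal exponent two admits all four functions, ordered as the four-element square.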


THEOREM 7. Cardinal exponentiation is single-valued and semi-isotone; it
satisfies the usual exponentiation laws with respect to the other cardinal operations
and 1:

(37) $A = B$ implies $A^C = B^C$ and $C^A = C^B$;

(38) $A \subset B$ implies $A^C \subset B^C$;

(39) $A < B$ implies $C^A \subset C^B$;

(40) $A^{B+C} = A^B A^C$, for all A, B, C;

(41) $(AB)^C = A^C B^C$, for all A, B, C;

(42) $(A^B)^C = A^{BC}$, for all A, B, C;

(43) $A^1 = A$ and $1^A = 1$ for all A.

10 This generalization was given in [1]; cf. also [2], p. 13.


Proof. Law (37) is evident from the abstract nature of the definition of cardinal
exponentiation. Since each isotone function from C to A is a fortiori one
from C to B if $A \subset B$, and the same inclusion relation holds, (38) is also immediate.
Regarding (39), let $\theta$ be the given correspondence from B to A. With
each g in $C^A$ associate $f = g\phi$, defined by the identity $f(b) = g(b\theta)$. By hypothesis,
$b \ge b'$ implies $b\theta \ge b'\theta$, hence $g(b\theta) \ge g(b'\theta)$ and so $f(b) \ge f(b')$: in short,
every $f = g\phi$ is isotone. Finally, if $f^* = g^*\phi$, $f(b) \ge f^*(b)$ for all b if and only if,
for all $a = b\theta$, $g(a) = g(b\theta) \ge g^*(b\theta) = g^*(a)$, completing the proof. This
proves (39). Incidentally, if A = C = the ordinal $1 \oplus 1$ and B is the cardinal
number 2, then we have an example where $A < B$ yet $A^C = 1 \oplus 1 \oplus 1 \not< B^C = 2$.
The proofs of (40)-(42) are contained in [1]; those of (43) are trivial. This proves
Theorem 7.

It is an immediate corollary of (38) and (43) that 


(44) $A \subset A^B$ for all A, B.


In fact, the left-hand side consists of all constant functions occurring on the 
right-hand side. 


8. Ordinal exponentiation: past methods. It seems to me that the weakest 
point of classical transfinite arithmetic comes when ordinal exponentiation is 
defined. 

Consider Cantor's original definition ([3], p. 118)

$\xi^1 = \xi$, $\xi^{\eta+1} = \xi^{\eta}\xi$, $\xi^{\sup \eta} = \sup(\xi^{\eta})$.

This inductive definition is essentially non-constructive. It is not even equivalent
to known constructive definitions, unlike the corresponding inductive definitions
of ordinal sums and products

$\xi + 1 = \xi + 1$, $\xi + (\eta + 1) = (\xi + \eta) + 1$, $\xi + (\sup \eta) = \sup(\xi + \eta)$,
$\xi 1 = \xi$, $\xi(\eta + 1) = \xi\eta + \xi$, $\xi(\sup \eta) = \sup(\xi\eta)$,

which are equivalent to the constructive definitions of §§4, 6.

Second, and this is also not the case with ordinal sums and products, it destroys
the otherwise perfect homomorphism from ordinal arithmetic to cardinal arithmetic.
Thus, in the notation of [3], whereas $2^{\omega} = \omega$ is an equation between ordinals,
the corresponding cardinal equation $2^{\aleph_0} = \aleph_0$ is false.

Again, consider Hausdorff's alternative definition: $Y^X$ is the set of all functions
from X to Y, ordered according to the first non-zero difference. This is
in close conformity with the corresponding cardinal definition, and sufficiently
constructive; thus it avoids the defects of Cantor's definition. But it has a
peculiar defect all its own: $Y^X$, although a chain, is not usually an ordinal;¹¹
thus $2^{\omega}$ is not an ordinal. To be sure, it gives approximately the order-type
of the real continuum (actually, that of the Cantor discontinuum), which is 
a very pretty result. But 2” is not even simply ordered, and if we use it as 
an exponent, we get something indescribable. I think it is fair to say that here 
Hausdorff gave up and defined a “partially ordered set’’ as one of those patho- 
logical things which he got by his construction. 

The definition of ordinal exponentiation given below is equivalent to Haus- 
dorff’s for both ordinals and cardinals, is constructive, and has the added ad- 
vantage that, under it, the family of chains (simply ordered sets) will at least 
be closed. 


9. Ordinal exponentiation: new definition. The change in Hausdorff’s 
definition of ordinal exponentiation which I propose is the following. 

DEFINITION. By the ordinal power ${}^{X}Y$ is meant the set of all functions f:
y = f(x) from X to Y, where $f \ge g$ means that to each x with $f(x) \not\ge g(x)$ corresponds
an $x' > x$ with¹² $f(x') > g(x')$.

This evidently coincides with Hausdorff's definition in the case that X is
an ordinal. Its greatest defect is that, although the relation $f \ge g$ is reflexive
and transitive, it is not anti-symmetric: ${}^{X}Y$ is often¹³ only a quasi-ordered set
([2], p. 7). But it is well known (loc. cit., Theorem 1.2) that such a set becomes
a partially ordered set, if $f = g$ is defined to mean $f \ge g$ and $g \ge f$. Hence the
defect is not essential.
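The relation $f \ge g$ of this definition is easy to tabulate when X and Y are finite. The sketch below is an illustration under assumptions: numbers are stored as (elements, set of pairs (x, y) with x >= y), a function is stored as the tuple of its values in the order of the elements of X, and the helper names are hypothetical. It checks two special cases: for X the two-element chain the relation is the lexicographic order in which the greater element of X is the more significant position, and for X a cardinal it reduces to the pointwise order.

```python
from itertools import product

def chain(n):
    E = list(range(n))
    return E, {(x, y) for x in E for y in E if x >= y}

def antichain(n):
    E = list(range(n))
    return E, {(x, x) for x in E}

def ordinal_power(X, Y):
    """All functions X -> Y with the relation f >= g of the definition above."""
    (EX, GX), (EY, GY) = X, Y
    funcs = list(product(EY, repeat=len(EX)))   # f[i] is the value at EX[i]
    idx = range(len(EX))

    def ge(f, g):
        for i in idx:
            if (f[i], g[i]) in GY:
                continue                         # f(x) >= g(x): nothing to check
            # otherwise some x' > x with f(x') > g(x') is required
            if not any(EX[j] != EX[i] and (EX[j], EX[i]) in GX
                       and f[j] != g[j] and (f[j], g[j]) in GY for j in idx):
                return False
        return True

    return funcs, {(f, g) for f in funcs for g in funcs if ge(f, g)}

F, R = ordinal_power(chain(2), chain(2))
lex = {(f, g) for f in F for g in F if (f[1], f[0]) >= (g[1], g[0])}
transitive = all((f, h) in R for (f, g) in R for (g2, h) in R if g == g2)

F2, R2 = ordinal_power(antichain(2), chain(2))
pointwise = {(f, g) for f in F2 for g in F2 if f[0] >= g[0] and f[1] >= g[1]}
print(R == lex, transitive, R2 == pointwise)
```

Since X is finite in both cases, the relation obtained is already anti-symmetric; the quasi-ordered situation described above needs an infinite exponent and is not exhibited by this sketch.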


THEOREM 8. Ordinal exponentiation is single-valued, is slightly isotone, satisfies
the law of addition of exponents with respect to both cardinal and ordinal addition
and an ordinal semi-associative law of exponentiation. Formally,

(45) $A = B$ implies ${}^{C}A = {}^{C}B$ and ${}^{A}C = {}^{B}C$;

(46) $A \subset B$ implies ${}^{C}A \subset {}^{C}B$;

(47) ${}^{B+C}A = {}^{B}A \, {}^{C}A$, for all A, B, C;

(48) ${}^{B \oplus C}A = {}^{B}A \circ {}^{C}A$, for all A, B, C;

(49) ${}^{A \circ B}C < {}^{A}({}^{B}C)$, for all A, B, C;

(50) ${}^{1}A = A$ and ${}^{A}1 = 1$ for all A.

11 In a nutshell, ordinals are not closed under Hausdorff's definition, although cardinals,
paradoxically, are. If any definition yielded an ordinal (i.e., failed to have this defect)
of the correct power and was constructive (i.e., did not have the defects of Cantor's definition
either), it would constructively well-order the continuum. The difficulty of doing
this is sufficiently well known (cf. K. Gödel, The Consistency of the Axiom of Choice and of
the Generalized Continuum-hypothesis, Princeton, 1940).

12 The idea is that any difference in the values of f and g at x' dominates the values at
all points coming afterwards.

13 Technically, this is the case unless X satisfies the ascending chain condition or Y is
totally unordered (a cardinal number).


Proof. Law (45) is evident from the abstract nature of the definition of
ordinal exponentiation. Again, since each function from C to A is a fortiori
one from C to B if $A \subset B$, and inclusion is defined in the same way, (46) holds.
Next, we are to prove (47) and (48). The functions f from $B + C$ or $B \oplus C$
to A correspond one-one with the pairs of functions (g, h), one from B to A and
the other from C to A. In the cardinal case, $f \ge f'$ if and only if $g \ge g'$ and
$h \ge h'$; in the ordinal case, since $b > c$ for all $b \in B$, $c \in C$, $f \ge f'$ if and only if
$g > g'$ or $g = g'$ and $h \ge h'$. Therefore, the correspondence defines the asserted
isomorphism in both cases. Law (50) is trivial. As for (49), the functions
from $A \circ B$ to C assign to each couple (a, b) a value c = f(a, b), hence to
each fixed a, a function $f_a(b)$ from B to C; this is simply the usual one-one
correspondence from ${}^{A \circ B}C$ to ${}^{A}({}^{B}C)$. But in the first case, $f \not\ge g$ means that,
for some (a, b), (i) $f(a', b') > g(a', b')$ for no $a' > a$, regardless of b', (ii) $f(a, b') > g(a, b')$
for no $b' > b$, and (iii) $f(a, b) \not\ge g(a, b)$. But now condition (i) implies
$f_{a'} > g_{a'}$ for no $a' > a$, while conditions (ii) and (iii) assert that $f_a \not\ge g_a$; combining,
$f \not\ge g$ in ${}^{A \circ B}C$ implies $f \not\ge g$ in ${}^{A}({}^{B}C)$. Dualizing, $f \ge g$ in ${}^{A}({}^{B}C)$ implies
$f \ge g$ in ${}^{A \circ B}C$, which was what we wanted to prove. We note that if A and B
satisfy the ascending chain condition, then the equality holds in (49).


10. Dualization. In addition to the six binary operations which we have just 
discussed, there is an important unary operation: that of dualization. This is 
trivial for cardinals and ordinals, but is very important in most other cases. 

DEFINITION. By the dual of a number X (in symbols, $X^*$) is meant the number
obtained from X by replacing the inclusion relation in X by its converse
(cf. [2], p. 8).

Graphically, this amounts to turning the diagram of X upside down, i.e., 
to reversing the order in X. 

THEOREM 9. Dualization is single-valued, involutory, and isotone; it is an 
isomorphism for all cardinal operations, a dual isomorphism for ordinal addition 
and multiplication, and a semi-isomorphism on ordinal exponentiation. Formally, 


(51) if $X = Y$, then $X^* = Y^*$, while $(X^*)^* = X$;

(52) if $X \subset Y$, then $X^* \subset Y^*$; and if $X < Y$, then $X^* < Y^*$;

(53) $(X + Y)^* = X^* + Y^*$, $(XY)^* = X^*Y^*$, and $(X^Y)^* = (X^*)^{Y^*}$;

(54) $(X \oplus Y)^* = Y^* \oplus X^*$ and $(X \circ Y)^* = Y^* \circ X^*$;

(55) $({}^{Y}X)^* = {}^{Y}X^*$.
















Proof. All of the above results may be obtained by the reader by appealing 
to the appropriate definitions. The peculiar non-duality of ordinal exponentia- 
tion may help to explain the many vagaries of this operation. 


11. Closure properties. We shall define below various special classes of 
numbers, with particular reference to their closure under the various arithmetic 
relations and operations discussed above. For this, it will be convenient to give 
names to certain types of closure which appear repeatedly. 

First, if S is any set with a binary relation ρ, we shall call a subclass P of S
hereditary under ρ, when P contains with any element a, all x such that x ρ a.
(For this terminology, cf. C. Kuratowski, Topologie, Warsaw, 1933, p. 29.)

Next, if S is a set with a binary operation ·, we shall call a subclass P of S
closed under the operation if it contains a · b whenever it contains a and b.
This concept extends also to unary, ternary, and other operations.

Finally, the subclass P will be called a caste under the given binary operation,
provided it contains a · b if and only if it contains both a and b. In the language
of genetics, the property of belonging to P is recessive.

As examples of recessive properties, we have the following familiar cases: 
(i) homogeneity, for polynomials under multiplication, (ii) case of being a unit 
in a commutative ring (divisor of unity), (iii) case of being primary under a 
given prime ideal, under ideal multiplication. 
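Example (ii) can be verified exhaustively in a small ring. The sketch below takes the ring of integers modulo 12 as an arbitrary illustrative choice (the modulus and the helper names `is_closed` and `is_caste` are assumptions of this example) and contrasts the units, which form a caste under multiplication, with the subset {0}, which is closed but not a caste.

```python
from math import gcd

n = 12                       # illustrative modulus
ring = range(n)

def is_closed(P):
    """P contains a*b whenever it contains a and b."""
    return all((a * b) % n in P for a in P for b in P)

def is_caste(P):
    """P contains a*b if and only if it contains both a and b."""
    return all(((a * b) % n in P) == (a in P and b in P)
               for a in ring for b in ring)

units = {a for a in ring if gcd(a, n) == 1}   # divisors of unity mod n
zero = {0}

print(sorted(units), is_closed(units), is_caste(units))
print(is_closed(zero), is_caste(zero))
```

The subset {0} fails to be a caste because, for instance, the product of 2 and 6 lies in it although neither factor does.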

A "caste" thus defines a congruence relation with two equivalence classes,
P and S − P; and the complement of a caste has the properties of a (multiplicative)
prime ideal. What is more important, the property of being a caste
under a given operation is a "closure property" in the sense of [2] ("extensionally
attainable" in the sense of E. H. Moore). In fact, the caste-closure
$\bar{P}$ of a set P is its closure with respect to the operations (i) of including with any
element, all its ancestors, and (ii) of including with any two elements, their
product. For instance, if S is a lattice under the binary operation $\cup$, the caste-
closure of any subset is just the ideal generated by that subset.


12. Kinds of numbers. The following special classes of numbers will be 
considered below: cardinal numbers, ordinal numbers, chains, lattices, complete 
lattices, striated numbers, and finite numbers. 

A cardinal number means a number A such that $x \ge y$ in A implies $x = y$.
With any number X may be associated its cardinal number c(X); this is composed
of the elements of X, but with a new inclusion relation, which is allowed to
subsist only between identical elements.

An ordinal number is a number A, every non-void subset S of which has a
greatest (first) element s, satisfying $s \ge x$ for all $x \in S$. This is the usual well-
ordering condition. A chain is a so-called simply ordered system; a number A
such that for any $x, y \in A$ either $x \ge y$ or $y \ge x$.

Thus any ordinal number is a chain, while the only number which is both an 
ordinal and a cardinal is 1, the partially ordered system with a single element. 







In applications, a special rôle is played by those numbers which are lattices
in the sense of [2], that is, numbers A in which, given x and y, there exist a
smallest element $x \cup y$ containing both and a largest element $x \cap y$ contained in
both. Numbers such that this is true not only for two-element subsets, but for
arbitrary subsets, are called complete lattices.

By a bounded number will be meant a number having a least element o and
a greatest element i, such that $o \le x \le i$ for all x. By a striated number will
be meant a number A which satisfies the Jordan-Dedekind chain condition,
in the sense of [2]. More precisely, A will be called striated if and only if each
$x \in A$ has a numerical dimension d[x] which is a non-negative integer assuming
bounded values, and is such that if x covers y, then $d[x] = d[y] + 1$. By the
length d[A] of a striated number A is meant the maximum length of a chain
in A; this is $d[i] - d[o]$ if A is bounded.

Finally, the notion of a finite number will be understood in the obvious way; 
A is finite if and only if its cardinal number c(A) is finite, that is, if and only 
if there is no one-one correspondence between A and a proper subset of itself. 
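For finite numbers the classes just defined can be recognized by direct search. The following sketch is illustrative only; the representation (elements, set of pairs (x, y) with x >= y) and the predicate names are assumptions of this example. In the finite case every non-empty chain is well-ordered, so `is_chain` also serves as a test for ordinals there.

```python
def chain(n):
    E = list(range(n))
    return E, {(x, y) for x in E for y in E if x >= y}

def antichain(n):
    E = list(range(n))
    return E, {(x, x) for x in E}

def is_cardinal(P):
    """x >= y implies x = y."""
    return all(x == y for (x, y) in P[1])

def is_chain(P):
    """Any two elements are comparable."""
    E, G = P
    return all((x, y) in G or (y, x) in G for x in E for y in E)

def is_lattice(P):
    """Every pair has a least upper bound and a greatest lower bound."""
    E, G = P
    for x in E:
        for y in E:
            ub = [z for z in E if (z, x) in G and (z, y) in G]
            lb = [z for z in E if (x, z) in G and (y, z) in G]
            has_join = any(all((t, s) in G for t in ub) for s in ub)
            has_meet = any(all((s, t) in G for t in lb) for s in lb)
            if not (has_join and has_meet):
                return False
    return True

# The four-element square (cardinal product of two two-element chains).
SQ = [(i, j) for i in (0, 1) for j in (0, 1)]
square = SQ, {(u, v) for u in SQ for v in SQ if u[0] >= v[0] and u[1] >= v[1]}

print(is_chain(chain(3)), is_lattice(chain(3)), is_cardinal(antichain(3)))
print(is_lattice(square), is_chain(square), is_lattice(antichain(2)))
# Among the small chains, only the one-element number is also a cardinal.
print([n for n in range(1, 5) if is_cardinal(chain(n))])
```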


13. Closure properties of classes of numbers. One can represent in concise 
tabular form most of the closure properties of the classes of numbers just defined, 
with respect to the relations and operations of generalized arithmetic. The rows 
of the table represent the different classes of numbers, and the columns the relations
and operations of arithmetic. Thus the entry in the i-th row and j-th
column describes the closure properties of the i-th class of numbers with respect 
to the j-th arithmetic operation or relation. 


THEOREM 10. The following table of closure properties is correct.
$\subset$   $<$   $+$   $\oplus$   $\cdot$   $\circ$   $X^Y$   ${}^{Y}X$   $*$


ES eee ere eee H H R OR R BD BDC 
I ae i aa gh RS aces 8 H H O RO R OY 
ETT rae iF - SS F - ae a 
Sa ee oe eee OCR BD BD @ 
Complete lattice............... O C R R BD BHC 
CS Se I ae ree H O C R R BD BDC 
Serer tT ree R RRR BD BY’ C 
Sa a OF Sf SS 2 fe a ae 


Explanation. BD (base dominant) means resultant is in subset if and only 
if base X is. BD’ means resultant is in subset if and only if base is, and exponent 
is finite. C means subset is closed under operation (if operation is unary *, 
this is really the same as R). H means property is hereditary (supra). O 
means resultant of operation never has property; O’ means resultant never has 
property unless one factor is 1; O’’ means power never has property unless base 
is 1 or base is two-element chain and exponent has property. R means property 
is recessive (subset in question is caste); R’ means property is recessive unless 
base is 1. 










Proof. In almost all cases, the truth of the assertions indicated by the entries 
is well known or easily verified. We shall only sketch proofs in a few exceptional 
cases. 

For instance, consider the assertion that $X^Y$ is a lattice (complete lattice)
if and only if X is. The sufficiency follows since the subset $X^Y$ of isotone
functions is a (closed) sublattice of $X^{c(Y)}$; the necessity follows since, if X is not a
lattice, the constant functions in $X^Y$ corresponding to meetless or joinless sets
of elements of X are meetless respectively joinless in $X^Y$.

Again, consider the closure property of complete lattices under ordinal multi- 
plication. It is easily shown that 

$\sup (a_\alpha, b_\alpha) = (\sup a_\alpha, \sup_{\alpha \in \sigma} b_\alpha)$,

where $\sigma$ is the (possibly void) set of all $\alpha$ with $a_\alpha = \sup a_\alpha$. From this it follows
easily that complete lattices form a caste under ordinal multiplication.

The closure of chains under ordinal exponentiation is easy to prove. For
$f \ge g$ unless for some x, $f(x) \not\ge g(x)$, which is the same as $f(x) < g(x)$ if X is
a chain, while $f(x') = g(x')$ for all $x' > x$. But in this case, if Y is a chain, it
is easy to show that $f < g$.

The author has been unable to prove that the property of being a complete
lattice is base dominant under ordinal exponentiation without assuming the
ascending chain condition for the exponent Y. In this case, we can define
$g = \sup f_\alpha$ as follows. For any $y \in Y$, let $A_y$ denote the subset of all $\alpha$ such that
$f_\alpha(y') < g(y')$ for some $y' \in Y$; recursively, we define $g(y) = \sup f_\alpha(y)$
for all $\alpha \notin A_y$.

The other non-obvious cases concern striated lattices. The closure properties 
under addition are obvious, and 


(56) $d[A + B] = \sup(d[A], d[B])$; $d[A \oplus B] = d[A] + d[B]$.

Under multiplication, we have (a, b) covering (a', b') in AB if and only if a = a'
while b covers b' or vice versa. Lemma 1 of §6 takes care of the ordinal case.
Together, they give the closure properties of striated numbers, and

(57) $d[AB] = d[A] + d[B]$ and $d[A \circ B] = d[A]d[B]$.

Similarly, under exponentiation, one can verify that

(58) $d[A^B] = d[A]c(B)$.


The case of ordinal exponentiation is much more complicated, and would take 
up more space than it is worth. 


14. Special closure properties of lattices. The closure rules for lattices under 
ordinal multiplication and exponentiation are so curious as to deserve special 
mention. 







THEOREM 11. The ordinal product $L \circ M$ of two lattices is a lattice if and only
if L is simply ordered or M is bounded.

Proof. Clearly $(x, y) \cup (x', y')$ must be $(x, y)$, $(x', y')$, $(x, y \cup y')$, or $(x \cup x', o)$,
according as $x > x'$, $x < x'$, $x = x'$, or x and x' are incomparable. These conditions
can always be fulfilled if and only if M has an o or the fourth case never
arises, i.e., if L is a chain. Dualizing, we get the assertion of the theorem.

Since no l-group is bounded except the trivial l-group having a single element,
we get the

COROLLARY. The ordinal product $G \circ H$ of two l-groups is an l-group if and
only if G is simply ordered.

THEOREM 12. For ${}^{X}Y$ to be a lattice, it is necessary and sufficient that one of the
following three conditions hold: (i) Y is a lattice and X a cardinal number, (ii)
Y is a bounded lattice, (iii) Y is a chain and X a semi-root.


Explanation. A “semi-root” is a partially ordered set in which the elements 
above each fixed element form a chain. 

First, Y must be a lattice, or else not even all pairs of constant functions will
have joins and meets. If Y is a lattice, then since

(59) if X is a cardinal number, then ${}^{X}Y = Y^X$,

it is sufficient that X be a cardinal number, by Theorem 10. Again, if Y is a
bounded lattice, then ${}^{X}Y$ is a lattice. The only problem is in case f(x) and g(x)
are incomparable. In this case, call a critical value of x one such that $f(x') = g(x')$
for all $x' > x$, whereas $f(x) \ne g(x)$. Set $h(x) = f(x) \cup g(x)$ at all critical values,
and $h(t) = o$ at all points less than critical values. Repeating this process on
$(h, f) = h_1$, $(h_1, g) = h_2$, ..., and using transfinite induction (this is a "sweeping-down
process" for critical values), we will arrive ultimately at $f \cup g$, which
thus exists. Every $h_i$ is contained in all upper bounds to f and g.

There remains the case that Y is not bounded and X is not a cardinal number.
In this case, Y must be a chain, or we could choose $x > x'$ with f(x) and g(x)
incomparable but $f(t) = g(t)$ for all $t > x$, whence if $h = f \cup g$, we would have
$h(x') = o$, and dually, contradicting the assumption that Y was not bounded.
Further, unless X is a semi-root, we can find $x' > x$, $x'' > x$, with x', x'' incomparable.
Again, since Y is not bounded, $Y \ne 1$; hence we can choose f and g
such that $f(t) = g(t)$ for all $t \ne x', x''$, $f(x') > g(x')$, and $f(x'') < g(x'')$. Again,
if $h = f \cup g$, clearly $h(x) = o$; forming $f \cap g$ similarly, Y would have
to be bounded, contrary to hypothesis. Hence Y is a chain and X is a semi-root.
The sufficiency of these conditions is easy. Given f and g, form $h = f \cup g$ by
making $h(x) = f(x)$ unless x is below a critical value t at which $f(t) < g(t)$; at
such points, set $h(x) = g(x)$.

14 This result was implicitly conjectured by Mr. J. C. Abbott while doing graduate work 
at Harvard University in 1938-1939. 

For the notion of l-group, cf. the author's Lattice-ordered groups, Annals of Mathematics,
vol. 43(1942), pp. 298-331.










Just as before, we have the

COROLLARY. If Y is an l-group, then ${}^{X}Y$ is an l-group if and only if X is a
cardinal number, or X is a semi-root and Y is simply ordered.


15. Cardinals and ordinals: special properties. There are various special 
properties of ordinal and cardinal numbers which should be mentioned, if only 
because so many of them hold for more general classes of numbers. 

First we note the following more or less trivial properties of the function 
c(A). We have 


(60) $c(A + B) = c(A \oplus B) = c(A) + c(B)$;

(61) $c(AB) = c(A \circ B) = c(A)c(B)$;

(62) $A < c(A)$ for all A;

(63) $A^B \subset A^{c(B)}$ and ${}^{B}A < {}^{c(B)}A$.

Then we have the counterpart of (59),

(64) if A is a cardinal, then $A \circ B = AB$.
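Law (64) can be observed directly on finite examples, since both products are built on the same set of couples. The sketch below is illustrative: numbers are stored as (elements, set of pairs (x, y) with x >= y) and the helper names are assumptions of this example. It also shows that the hypothesis matters, since the two products differ when A is a two-element chain.

```python
def chain(n):
    E = list(range(n))
    return E, {(x, y) for x in E for y in E if x >= y}

def antichain(n):
    E = list(range(n))
    return E, {(x, x) for x in E}

def cardinal_product(P, Q):
    """(a, b) >= (a', b') iff a >= a' and b >= b'."""
    (EP, GP), (EQ, GQ) = P, Q
    E = [(a, b) for a in EP for b in EQ]
    return E, {(u, v) for u in E for v in E
               if (u[0], v[0]) in GP and (u[1], v[1]) in GQ}

def ordinal_product(P, Q):
    """(a, b) >= (a', b') iff a > a', or a = a' and b >= b'."""
    (EP, GP), (EQ, GQ) = P, Q
    E = [(a, b) for a in EP for b in EQ]
    return E, {(u, v) for u in E for v in E
               if (u[0] != v[0] and (u[0], v[0]) in GP)
               or (u[0] == v[0] and (u[1], v[1]) in GQ)}

B = chain(3)
A = antichain(2)                      # a cardinal
same = cardinal_product(A, B)[1] == ordinal_product(A, B)[1]

C = chain(2)                          # not a cardinal
differ = cardinal_product(C, B)[1] != ordinal_product(C, B)[1]
print(same, differ, len(ordinal_product(A, B)[0]))
```

The last printed value is the number of couples, in agreement with (61).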


The most important special properties of ordinal and cardinal are the following 
anti-symmetric and comparability laws. 


THEOREM 13. If A and B are both cardinals or both ordinals, then

(65) $A \subset B$ and $B \subset A$ imply $A = B$;

(66) either $A \subset B$ or $B \subset A$.

These laws are well known. We shall see later that the anti-symmetry law
is valid also for finite numbers. We may also note without proof that if A
and B are both cardinals or both finite ordinals, then

(67) $A < B$ if and only if $A \subset B$.

An interesting partial extension of this result is the fact that if B is a chain, then

(68) $A < B$ implies $A \subset B$.


To see this, suppose we take a single representative b(a) from the antecedents
of each $a \in A$; the correspondence $b(a) \to a$ will then be one-one and preserve
order; hence it will be an isomorphism.

Another special property of finite ordinals is seen in the commutative laws 


(69) $A \oplus B = B \oplus A$ and $A \circ B = B \circ A$,

which are valid for these numbers. We have further¹⁵


15 Cf. S. Sherman, Some new properties of transfinite ordinals, Bulletin of the American 
Mathematical Society, vol. 47(1941), pp. 111-116. 










(70) $A \circ (B \oplus C) \subset (A \circ B) \oplus (A \circ C)$, for any ordinals.

Other important special laws for cardinals are the following converses of (18).

THEOREM 14. If A and B are both cardinals, then

(71) $A \subset B$ implies $A = B$ or $A + X = B$ for some X;

if A and B are both ordinals, then

(72) $A \subset B$ implies $A = B$ or $A \oplus X = B$ for some X.

In somewhat the same vein, we may note that if $B \supset A$, and A, B are ordinals,
then either $B = Q \circ A$ or $B = Q \circ A \oplus R$ $(R < A)$ for some unique Q, R (right
division algorithm). This fact enables one to develop a factorization theory for
finite and infinite ordinals (cf. [3]).


16. Finite numbers: special properties. When we come to finite numbers, 
we find first that the anti-symmetric laws hold. More precisely, we have 

THEOREM 15. If A and B are finite numbers, then

(65) $A \subset B$ and $B \subset A$ imply $A = B$, and

(73) $A < B$ and $B < A$ imply $A = B$.

Proof. Law (65) is trivial, since $c(A) = c(B)$. As for (73), first note that
since $c(A) \le c(B)$ and conversely, A and B have equally many elements. Hence
the homomorphisms are one-one. A similar argument shows that they must
leave the number of ordered couples (x, y) such that $x \ge y$ invariant. Hence
they are isomorphisms.

More interesting is the study of cancellation laws. These are lost in ordinary 
transfinite cardinal arithmetic, and half lost in transfinite ordinal arithmetic. 
The results stated below put them in their true setting. 

THEOREM 16. If A is any finite number, then

(74) $A + X = A + Y$ implies $X = Y$.

If A satisfies the ascending chain condition, then

(75) $A \oplus X = A \oplus Y$ implies $X = Y$,

and dually, if A satisfies the descending chain condition,

(75') $X \oplus A = Y \oplus A$ implies $X = Y$.

Proof. As for (74), this follows from the unique decomposition theorem for
cardinal addition; an explicit proof can also be given. As for (75)-(75'), by
duality, we need only prove (75). If (75) is not true, we can assume (by symmetry)
that under the given isomorphism $\theta$ from $A \oplus X$ to $A \oplus Y$, $a\theta \in Y$ for
some $a \in A$. But this means that for $y = a\theta$, $a\theta^{-1} > y\theta^{-1} = a$, since $a > y$
in $A \oplus Y$ and $\theta^{-1}$ is an isomorphism. It follows that A would have to possess
an infinite ascending chain $a < a\theta^{-1} < a\theta^{-2} < \cdots < a\theta^{-n} < \cdots$, contrary
to hypothesis.


COROLLARY. The cancellation law (75) is valid for ordinals.


The author has not attempted to generalize the one-sided cancellation law 
for ordinal multiplication to all numbers which satisfy the ascending chain con- 
dition, but conjectures that this is possible. 

If X is finite, or satisfies the ascending chain condition, then ${}^{X}Y$ can be defined
to consist of all functions y = f(x) from X into Y, where $f \ge g$ means that
for some (maximal) x, $f(x) \ge g(x)$, while for all $x' > x$, $f(x') = g(x')$. Using this
definition, we can prove the associative law of exponentiation:

(76) ${}^{A \circ B}C = {}^{A}({}^{B}C)$, if A and B satisfy the ascending chain condition.

Indeed, the usual one-one correspondence subsists between the elements
c = f(a, b) of ${}^{A \circ B}C$ and those $c = f_a(b)$ of ${}^{A}({}^{B}C)$. In the former, $f \ge g$ means
that, for some a, b, (i) $f(a', b') = g(a', b')$ for all $a' > a$ and (i') $f(a, b') = g(a, b')$
for all $b' > b$, while (ii) $f(a, b) \ge g(a, b)$. In the latter, $f \ge g$ means that, for
some a, (i) $f_{a'} = g_{a'}$ for all $a' > a$, while (iii) $f_a \ge g_a$. But (iii) means in turn
that, for some b, (i') $f_a(b') = g_a(b')$ for all $b' > b$, while (ii) $f_a(b) \ge g_a(b)$. The
isomorphism can now be read off from the equivalence between the two forms
of (i), (i'), (ii).
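Law (76) can be tested by brute force on small finite numbers, which trivially satisfy the ascending chain condition. The sketch below is an illustration under assumptions: it uses the relation of the Definition in §9 (to each x with f(x) not >= g(x) corresponds an x' > x with f(x') > g(x')), stores numbers as (elements, set of pairs (x, y) with x >= y), stores functions as tuples of values, and all helper names are hypothetical. The usual one-one correspondence appears as `curry`, which cuts the value tuple of a function on $A \circ B$ into one block per element of A.

```python
from itertools import product

def chain(n):
    E = list(range(n))
    return E, {(x, y) for x in E for y in E if x >= y}

def antichain(n):
    E = list(range(n))
    return E, {(x, x) for x in E}

def ordinal_product(P, Q):
    (EP, GP), (EQ, GQ) = P, Q
    E = [(a, b) for a in EP for b in EQ]          # a-major order
    return E, {(u, v) for u in E for v in E
               if (u[0] != v[0] and (u[0], v[0]) in GP)
               or (u[0] == v[0] and (u[1], v[1]) in GQ)}

def ordinal_power(X, Y):
    """All functions X -> Y (as value tuples) with the relation of Section 9."""
    (EX, GX), (EY, GY) = X, Y
    funcs = list(product(EY, repeat=len(EX)))
    idx = range(len(EX))

    def ge(f, g):
        for i in idx:
            if (f[i], g[i]) in GY:
                continue
            if not any(EX[j] != EX[i] and (EX[j], EX[i]) in GX
                       and f[j] != g[j] and (f[j], g[j]) in GY for j in idx):
                return False
        return True

    return funcs, {(f, g) for f in funcs for g in funcs if ge(f, g)}

def check_76(A, B, C):
    """True if currying carries the order of ^(A o B)C exactly onto that of ^A(^B C)."""
    nb = len(B[0])
    flat = ordinal_power(ordinal_product(A, B), C)
    nested = ordinal_power(A, ordinal_power(B, C))
    curry = lambda f: tuple(f[i:i + nb] for i in range(0, len(f), nb))
    return {(curry(f), curry(g)) for (f, g) in flat[1]} == nested[1]

r1 = check_76(antichain(2), chain(2), chain(2))
r2 = check_76(chain(2), antichain(2), chain(2))
r3 = check_76(chain(2), chain(2), antichain(2))
print(r1, r2, r3)
```

Such a check covers only the particular triples tried and is no substitute for the argument above.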


17. Bounded numbers. Bounded numbers also have a number of special 
properties not true of all numbers. In the first place, we can state the following 
simple results: 

(77) $A < A \oplus B$, $B < A \oplus B$, for bounded numbers;

(78) $A < B$ implies $A \circ C < B \circ C$, for bounded numbers.

The reader should have no trouble in proving (77). To prove (78), let $\theta$ map
B on A as assumed, and let S(a) denote the set of $b \in B$ such that $b\theta = a$. In
each S(a), we can (by finite or transfinite induction) choose a maximal non-void
set T(a) of incomparable elements, that is, elements such that $x > y$ for no $x, y \in T(a)$.
Relative to T(a), each element b of S(a) will fall into just one of the following
categories: (i) $b > b'$ for some $b' \in T(a)$, (ii) $b \in T(a)$, (iii) $b < b'$ for some $b' \in T(a)$.
Now map (b, c) on $(b\theta, i)$ of $A \circ C$ in case (i), on $(b\theta, c)$ in case (ii), and on $(b\theta, o)$
in case (iii). It may be checked that if $b > b'$, then the image of (b, c) contains
that of any (b', c') whether $b\theta > b'\theta$ or $b\theta = b'\theta$; also, that of (b, c) obviously
contains that of (b, c') for any $c' \le c$.

THEOREM 17. Any two decompositions of a bounded number into cardinal
factors have a common refinement.¹⁶

COROLLARY 1. Any finite bounded number can be factored uniquely into
indecomposable ("prime") factors.


16 This result is proved in [2], Theorem 2.9. 







This implies the following extension of the cancellation law for the multiplica- 
tion of finite cardinals. 


COROLLARY 2. $AX = AY$ implies $X = Y$ if X, Y, A are finite and bounded.

This result holds more generally if AX and AY have a finite "center" in the
sense of [2]. It suggests the conjectures that if AX and AY are finite, then
$AX \subset AY$ may imply $X \subset Y$, and possibly even $AX < AY$ may imply $X < Y$.

The author knows no example which would prove that the boundedness condition
was irredundant in Corollaries 1-2 above. But in (78), if C is the cardinal
two, A is one, and B is the ordinal two, we have $A < B$ yet $C = A \circ C \not< B \circ C$;
hence the boundedness condition is not redundant.

Regarding cancellation laws, the author conjectures that the following law is
valid for finite numbers: $A^C = B^C$ implies $A = B$ if $A$, $B$, $C$ are finite. In
the case $C$ is a cardinal number, it follows if a unique factorization theorem is
known (e.g., if $A$, $B$ are bounded). In general, it is not even known whether
$A^2 = B^2$ implies $A = B$; this has been conjectured by S. Ulam for general
abstract systems. It is certain that $C^A = C^B$ does not imply $A = B$; thus if
$C$ is a cardinal number, then $C^A = C$ for all lattices, and indeed for all numbers
$A$ not representable as cardinal sums.
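
The last remark can be checked by brute force on a small case. The sketch below assumes that the cardinal power $C^A$ is read as the set of isotone maps from $A$ to $C$ under the pointwise order (the reading suggested by the proof of (81)); the helper names `isotone_maps`, `leq_chain` and `leq_antichain` are illustrative and do not come from the paper.

```python
from itertools import product

def isotone_maps(A, leq_A, C, leq_C):
    """All order-preserving maps from the finite poset A to the finite poset C."""
    maps = []
    for values in product(C, repeat=len(A)):
        f = dict(zip(A, values))
        if all(leq_C(f[x], f[y]) for x in A for y in A if leq_A(x, y)):
            maps.append(f)
    return maps

def leq_chain(x, y):
    return x <= y          # a chain: a (very simple) lattice

def leq_antichain(x, y):
    return x == y          # an antichain: a cardinal number

C = ["u", "v"]             # the cardinal two

# A is the three-element chain, which is not a cardinal sum.
maps_chain = isotone_maps([0, 1, 2], leq_chain, C, leq_antichain)

# A is the cardinal two itself, i.e. the cardinal sum 1 + 1.
maps_antichain = isotone_maps([0, 1], leq_antichain, C, leq_antichain)

print(len(maps_chain), len(maps_antichain))
```

For the chain only the two constant maps survive, and they are incomparable, so the power is again a copy of $C$. For the antichain every map is isotone, which is why the hypothesis "not representable as cardinal sums" is needed.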


18. Special interpretations with lattices. In the case of lattices, it is natural
to replace the relation $A \subset B$ by the stronger relation $A \subset^* B$, meaning that $A$
is a sublattice of $B$. Similarly, it is natural to replace the relation $A < B$ by the
stronger relation $A <^* B$, meaning that $A$ is a lattice-homomorphic image of $B$.
We have the following cross-connection:


(79)  $A <^* B$ implies $A \subset B$ if $B$ is a finite lattice.


For, the correspondence $a \to \sup_{b\theta = a} b$ is an isomorphism. If $B$ is distributive,
we even have $A \subset^* B$.

Many of the isotonicity and duality laws (cf. (10), (11), (18), (19), (21), (22),
(52), etc.) proved above for the relations $\subset$ and $<$ are valid also for the stronger
relations $\subset^*$ and $<^*$. In particular, we have


(80)  $A \subset B$ implies $C^A <^* C^B$.


At least in case $C$ is the ordinal two, we have (see Garrett Birkhoff, Rings of sets,
this Journal, vol. 3 (1937), p. 454) the curious counterpart that $A < B$ implies
$C^A \subset^* C^B$, and in fact, that all sublattices of $C^B$ may be obtained in this way.

We also have (and in the finite case this follows from laws (79) and (80))


(81)  $A \subset B$ implies that $C^A \subset C^B$ if $C$ is complete.

Proof. Given $f$ from $A$ to $C$, define $g = f\theta$ by

$g(b) = \sup_{a \le b} f(a)$ for any $b \in B$.










Clearly $g$ is isotone; and, if $f$ is isotone, $g(b) = f(b)$ for all $b \in A$. Hence
$g$ is an extension of $f$, and the correspondence $f \to g$ is one-one; it is evidently
isotone.

The author has no counterexample to (81) for numbers not complete lattices, 
but the possibility of extension does not exist in all cases. 


19. Applications. We now come to the main argument in favor of a broader 
attitude towards the six arithmetic operations and dualization: the fact that in 
the wider context of general partially ordered sets, many new applications of 
these operations are found. Let us study this situation, first as to the cardinal 
operations, and then as to the ordinal operations. 

For the sake of comparison, it should first be stressed that the cardinal operations
really have very few applications in traditional transfinite arithmetic.¹⁷
The operations of addition and multiplication are actually trivial, since the sum
of any two infinite numbers, like their product, is simply the larger of the two
summands (multiplicands). The remaining operation, that of exponentiation,
is primarily useful in constructing from $\aleph_0$ the only known infinite cardinals,
including $c = 2^{\aleph_0}$.

In contrast to this slim array of classical applications, there are known at 
least nine distinct applications of our extended cardinal arithmetic to questions 
of lattice theory,¹⁸ especially to the theory of Boolean algebras and distributive
lattices. Besides, they may be applied to topology ([2], p. 15). If M and N 
are any two abstract complexes, then M + N represents their topological sum 
and MN their topological (or “‘Cartesian”’) product. While if M represents any 
subdivision of a manifold without boundary, then 1/* represents the dual of 
the subdivision. 

Again, it is fair to ask, what applications of the ordinal operations of tradi- 
tional transfinite arithmetic are known? Transfinite induction should not be 
included, as it does not involve addition, multiplication, or exponentiation.¹⁹
The operations of addition and multiplication are primarily useful in that they 
afford a neat notation for countable ordinals; the same is true of Cantor’s 
inductively defined operation of exponentiation. These ordinals are used in 
such places as the Baire classification of functions. Also, with Hausdorff’s 
exponentiation operation, 2° gives the Cantor discontinuum. 

Equally important, it seems to me, are the applications of these operations to partially
ordered systems which are not well-ordered. In the first place, ordinal multiplication
is useful in the construction and description of non-Archimedean
ordered groups,²⁰ which are so basic in modern valuation theory. Again, the
most general vector lattice with finite basis can be described as ${}^X R$, where $R$ is
the real number system and $X$ is the most general "semi-root".²¹ Further,
the lattice of l-ideals of ${}^X R$ is $B^X$, where $B$ denotes the ordinal two, thus establishing
a curious connection between cardinal and ordinal exponentiation. Finally,
the most general vector lattice can be built up from $R$ by repeated cardinal and
ordinal multiplication, to form such vector lattices as $(R \circ (RR))(R \circ R)$.


17 The distinction between $\aleph_0$ and $c$, although fundamental in modern analysis, is
independent of the operations of cardinal arithmetic.

18 These are listed in [1], §3, and will not be repeated here. They mainly concern exponentiation,
although the operation of cardinal multiplication (forming the direct product)
is also of fundamental importance in lattice theory.

19 It would be a mistake to minimize the importance of transfinite induction. However,
it should be noted that transfinite sequences are being replaced by directed sets in topology
(cf. J. W. Tukey's Convergence and Uniformity in Topology, Princeton, 1940), and that transfinite
induction is being replaced by the Lemma of Zorn in many other connections.


BIBLIOGRAPHY 


1. G. BIRKHOFF, An extended arithmetic, this Journal, vol. 3 (1937), pp. 311-316.
2. G. BIRKHOFF, Lattice Theory, New York, 1940.
3. F. HAUSDORFF, Grundzüge der Mengenlehre, first edition, Leipzig, 1914.
4. M. H. STONE, The Theory of Real Functions, Ann Arbor, Michigan, 1940.


HARVARD UNIVERSITY. 


20 Cf., for example, H. Hahn, Ueber die nichtarchimedischen Grössensysteme, S.-B. Wiener
Akad., Math.-Nat. Klasse, Abt. IIa, vol. 116 (1907), pp. 601-653.

21 For the facts stated here, cf. the author's paper Lattice-ordered groups and the
doctoral thesis of Mr. Murray Mannos.



















MAXIMAL FIELDS WITH VALUATIONS 
By IRVING KAPLANSKY


1. Introduction. A field with a valuation is said to be maximal if it possesses 
no proper immediate extensions, i.e., if every extension of the field must enlarge 
either the value group or the residue class field. This definition is due to F. K. 
Schmidt, but was first published by Krull ([4], p. 191). In the same paper
Krull succeeded in proving that any field with a valuation possesses at least 
one immediate maximal extension and that any field of formal power series is 
maximal in its natural valuation. These facts led Krull to propound the fol- 
lowing two queries. 

(1) Is the immediate maximal extension of a field uniquely determined? 

(2) If a maximal field K has the same characteristic as its residue class field, 
is K necessarily a power series field? 

These two closely related questions form the central problem of this investi- 
gation. The answer to the first is obtained in §3 (Theorem 5), as follows. The 
immediate maximal extension is always unique if the residue class field has 
characteristic $\infty$; but if the latter has characteristic $p$, then a pair of conditions
which we have labelled "hypothesis A" must be satisfied. It is then not
difficult to obtain the answer to the second question in §4. In fact, with the same
hypothesis, the answer is again affirmative, provided factor sets are admitted 
in the construction of the power series field (Theorem 6). Granted an addi- 
tional hypothesis, it is furthermore possible to dispense with factor sets 
(Theorem 8). In §5, examples are given to show that the conclusions of the 
preceding theorems may fail if hypothesis A is not fulfilled. 

The notion of pseudo-convergence, borrowed from Ostrowski ([9], p. 368), 
appears to be a natural tool for investigations of maximality, and it is employed 
consistently throughout the paper. The reason for this is to be found in Theo- 
rem 4, which shows that pseudo-convergence provides us with an intrinsic 
characterization of maximality. 


2. Pseudo-convergence and maximality. Throughout this section $K$ will
always denote a field with a valuation $V$ on an ordered Abelian group $\Gamma$, $B$ its
valuation ring, and $\mathfrak{K}$ its residue class field.¹

DEFINITION.² A well-ordered set $\{a_\rho\}$ of elements of $K$, without a last element,
is said to be pseudo-convergent if

(1)  $V(a_\sigma - a_\rho) < V(a_\tau - a_\sigma)$

for all $\rho < \sigma < \tau$.


Received November 3, 1941; presented to the American Mathematical Society, February
22, 1941. The author wishes to express his thanks to Professor MacLane for his assistance
in the preparation of this paper.

1 For these definitions, cf. [4] and [7].

LEMMA 1.³ If $\{a_\rho\}$ is pseudo-convergent, then either
(i) $V(a_\rho) < V(a_\sigma)$ for all $\rho < \sigma$, or
(ii) $V(a_\rho) = V(a_\sigma)$ from some point on, i.e., for all $\rho, \sigma \ge$ some ordinal $\lambda$.

Proof. Suppose that (i) does not hold, i.e., that $V(a_\rho) \ge V(a_\sigma)$ for some
$\rho < \sigma$. Then $V(a_\tau)$ must equal $V(a_\sigma)$ for all $\tau > \sigma$. For, if not, we would have

$V(a_\tau - a_\sigma) = \min [V(a_\tau), V(a_\sigma)] \le V(a_\sigma)$,

while $V(a_\sigma - a_\rho) \ge V(a_\sigma)$, so that the inequality (1) could not possibly hold.
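LEMMA 2. If $\{a_\rho\}$ is pseudo-convergent, then $V(a_\sigma - a_\rho) = V(a_{\rho+1} - a_\rho)$ for all
$\rho < \sigma$.

Proof. We may assume $\sigma > \rho + 1$. From the inequality

$V(a_{\rho+1} - a_\rho) < V(a_\sigma - a_{\rho+1})$,

and the identity

$a_\sigma - a_\rho = (a_\sigma - a_{\rho+1}) + (a_{\rho+1} - a_\rho)$,

we deduce that

$V(a_\sigma - a_\rho) = \min [V(a_\sigma - a_{\rho+1}), V(a_{\rho+1} - a_\rho)] = V(a_{\rho+1} - a_\rho)$.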


As a consequence of Lemma 2, we can unambiguously introduce the abbreviation
$\gamma_\rho$ for $V(a_\sigma - a_\rho)$ ($\rho < \sigma$). We note that $\{\gamma_\rho\}$ is a monotone increasing set
of elements of $\Gamma$.
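
A small numerical illustration may help fix the definitions. The sketch below is not from the paper: it assumes the field is the rational numbers with the additive 3-adic valuation, takes the partial sums $a_\rho = 1 + 3 + \cdots + 3^{\rho-1}$ as a candidate pseudo-convergent set, and checks inequality (1) and Lemma 2 on finitely many indices only (the definition itself concerns a well-ordered set without a last element). The names `vp` and `is_pseudo_convergent` are illustrative.

```python
from fractions import Fraction

def vp(x, p):
    """Additive p-adic valuation of a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("V(0) is the symbol infinity; not handled here")
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v

def is_pseudo_convergent(a, p):
    """Check inequality (1) on every triple rho < sigma < tau of the finite list a."""
    n = len(a)
    return all(
        vp(a[s] - a[r], p) < vp(a[t] - a[s], p)
        for r in range(n) for s in range(r + 1, n) for t in range(s + 1, n)
    )

p = 3
a = [sum(p**k for k in range(rho)) for rho in range(8)]   # 0, 1, 4, 13, ...

# gamma_rho = V(a_{rho+1} - a_rho); Lemma 2 says any later index gives the same value.
gamma = [vp(a[rho + 1] - a[rho], p) for rho in range(len(a) - 1)]
lemma2_holds = all(
    vp(a[s] - a[r], p) == gamma[r]
    for r in range(len(a) - 1) for s in range(r + 1, len(a))
)

# In this example x = -1/2 satisfies V(x - a_rho) = gamma_rho on the indices tested.
x = Fraction(-1, 2)
x_matches = all(vp(x - a[r], p) == gamma[r] for r in range(len(gamma)))

print(is_pseudo_convergent(a, p), gamma, lemma2_holds, x_matches)
```

Here the values $\gamma_\rho$ come out as $0, 1, 2, \ldots$, which is the monotone increasing behaviour noted above.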

DEFINITION.⁴ An element $x$ of $K$ is said to be a limit of the pseudo-convergent
set $\{a_\rho\}$ if $V(x - a_\rho) = \gamma_\rho$ for all $\rho$.

DEFINITION. The set of all elements $y$ of $K$ such that $Vy > \gamma_\rho$ for all $\rho$
forms an (integral or fractional) ideal in the valuation ring $B$; this ideal we
call the breadth⁵ of $\{a_\rho\}$.

The limit of a pseudo-convergent set is by no means unique; however, given 
one limit, it is easy to describe the totality of limits. 

LEMMA 3. Let $\{a_\rho\}$ be pseudo-convergent with breadth $\mathfrak{A}$, and let $x$ be a limit
of $\{a_\rho\}$. Then an element is a limit of $\{a_\rho\}$ if and only if it is of the form $x + y$,
with $y \in \mathfrak{A}$.

Proof. If $z$ is any other limit, it follows from


$x - z = (x - a_\rho) - (z - a_\rho)$


2 Cf. [9], p. 368. The inequalities here read in the opposite sense because Ostrowski 
uses an exponential valuation. 

3 Cf. [9], p. 369. 

4 This definition does not always coincide with Ostrowski’s, [9], p. 375. 

5 A translation of ‘‘Breite’’, [9], p. 368. 










that $V(x - z) > \gamma_\rho$ for all $\rho$, whence $x - z$ lies in $\mathfrak{A}$. Conversely, if $y \in \mathfrak{A}$, then
$Vy > \gamma_\rho = V(x - a_\rho)$,
and so $V(x + y - a_\rho) = \gamma_\rho$, whence $x + y$ is a limit of $\{a_\rho\}$.

Let the field $L$ be an extension of $K$, with a valuation which is an extension
of $V$. If the value group and residue class field of $L$ coincide with $\Gamma$ and $\mathfrak{K}$,
respectively, we say that $L$ is an immediate extension of $K$. If $K$ admits no
proper immediate extensions, $K$ is said to be maximal. It will now be our
object to prove that maximality is equivalent to the possession of a limit by
every pseudo-convergent set; half of this equivalence is obtained in the following
theorem.

THEOREM 1. Let L be an immediate extension of K. Then any element in L 
but not in K is a limit of a pseudo-convergent set of elements of K, without a limit in K. 

Proof.⁶ Let $z$ be an element in $L$ but not in $K$, and let $S$ denote the totality
of values $V(z - a)$, with $a$ in $K$. Certainly $S$ does not include the symbol $\infty$.
Further, $S$ cannot have a greatest member $\gamma$. For, suppose $V(z - g) = \gamma$,
$g \in K$; let $c \in K$ have value $\gamma$, and let $d \in K$ be a representative of the residue
class of $(z - g)/c$. Then $V(z - g - cd) > \gamma$, where $g + cd \in K$, a contradiction.

From the set $S$ select a well-ordered cofinal subset⁷ $\{\alpha_\rho\}$; since $S$ has no
greatest member, $\{\alpha_\rho\}$ cannot have a last term. Choose elements $a_\rho \in K$ with
$V(z - a_\rho) = \alpha_\rho$. The identity

$a_\sigma - a_\rho = (z - a_\rho) - (z - a_\sigma)$,

together with the inequality

$V(z - a_\rho) < V(z - a_\sigma)$  ($\rho < \sigma$),

then imply

(2)  $V(a_\sigma - a_\rho) = V(z - a_\rho)$  ($\rho < \sigma$),

whence $\{a_\rho\}$ is pseudo-convergent with $z$ as limit.

Suppose that $\{a_\rho\}$ had the further limit $z_1$ in $K$. Then, by Lemma 3,

$V(z - z_1) > V(a_\sigma - a_\rho)$  ($\rho < \sigma$).

Combining this with (2) and using the fact that $\{a_\rho\}$ has no last member, we
obtain

$V(z - z_1) > V(z - a_\rho) = \alpha_\rho$

for all $\rho$; and this is a contradiction, since $\{\alpha_\rho\}$ is cofinal in $S$.


Next, we must show that if some pseudo-convergent set $\{a_\rho\}$ in $K$ lacks a
limit, then $K$ is not maximal. This will be accomplished by adjoining to $K$ a

6 It is perhaps worth remarking that Theorem 1 and the preceding lemmas do not depend 
on the commutativity of either & or I. 
7 [3], p. 129. 







limit of $\{a_\rho\}$ and then proving that the resulting extension is immediate. Since
we shall later be interested in questions of uniqueness, Theorems 2 and 3 will
also include some preliminary results on uniqueness.

We borrow from Ostrowski the following two lemmas ([9], p. 371, IV and III).
His proofs are readily adapted for the more general case under consideration
here.


LEMMA 4. Let $\beta_1, \ldots, \beta_m$ be any elements of an ordered Abelian group $\Gamma$, and
further let $\{\gamma_\rho\}$ be a well-ordered, monotone increasing set of elements of $\Gamma$, without
a last element. Let $t_1, \ldots, t_m$ be distinct positive integers. Then there will exist
an ordinal $\mu$ and an integer $k$ ($1 \le k \le m$) such that

$\beta_i + t_i\gamma_\rho > \beta_k + t_k\gamma_\rho$

for all $i \ne k$ and $\rho > \mu$.
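
The content of Lemma 4 is easy to watch in a toy case. The sketch below is only an illustration under strong assumptions: the ordered group is taken to be the integers, the increasing set is $\gamma_\rho = \rho$ for finitely many $\rho$, and the data `beta` and `t` are made up. The helper `strict_min_index` is a hypothetical name.

```python
def strict_min_index(beta, t, g):
    """Index k with beta[k] + t[k]*g strictly smallest, or None if the minimum is tied."""
    values = [b + ti * g for b, ti in zip(beta, t)]
    m = min(values)
    winners = [i for i, v in enumerate(values) if v == m]
    return winners[0] if len(winners) == 1 else None

beta = [0, 5, -7]      # made-up elements beta_1, beta_2, beta_3
t = [3, 1, 2]          # distinct positive integers t_1, t_2, t_3
gammas = range(0, 40)  # gamma_rho = rho, monotone increasing

indices = [strict_min_index(beta, t, g) for g in gammas]
print(indices)
```

In this example the strict minimum changes hands early on, is tied at $\gamma_\rho = 12$, and from $\gamma_\rho = 13$ onward always belongs to the same term, which plays the role of $k$ in the lemma (with $\mu = 12$).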

LEMMA 5. If $\{a_\rho\}$ is pseudo-convergent in $K$, and $f(x)$ is a polynomial with
coefficients in $K$, then $\{f(a_\rho)\}$ is ultimately pseudo-convergent.⁸

By combining Lemmas 1 and 5 we can make a useful deduction concerning
the set $\{Vf(a_\rho)\}$, namely that for all sufficiently large $\rho$ and $\sigma$ either


(3)  $Vf(a_\rho) = Vf(a_\sigma)$

or

(4)  $Vf(a_\rho) < Vf(a_\sigma)$  ($\rho < \sigma$)


must hold. The distinction between these two cases will persist throughout the
discussion, and for convenience we introduce the following definitions.

DEFINITIONS. A pseudo-convergent set $\{a_\rho\}$ in $K$ is said to be of transcendental
type (with respect to $K$) if (3) holds for every polynomial $f(x)$ with coefficients
in $K$; if, on the other hand, (4) holds for at least one polynomial $f(x)$, we shall
say that $\{a_\rho\}$ is of algebraic type.

THEOREM 2. If there is a pseudo-convergent set $\{a_\rho\}$ of transcendental type in $K$,
without a limit in $K$, then there exists an immediate transcendental extension $K(z)$
of $K$. The valuation of $K(z)$ can be specifically defined as follows: for any polynomial
$f(z)$ with coefficients in $K$ we define $Vf(z)$ to be the fixed value which $Vf(a_\rho)$
ultimately assumes. In the resulting valuation, $K(z)$ is an immediate extension
of $K$, and $z$ is a limit of $\{a_\rho\}$.

Conversely, if $K(u)$ is any extension of $K$, with a valuation which is an extension
of $V$ such that $u$ is a limit of $\{a_\rho\}$, then $K(u)$ and $K(z)$ are analytically equivalent
over $K$.⁹


8 It is to be noted that Ostrowski's pseudo-convergence need only hold from
some point on.

9 By an analytic equivalence over $K$ we mean a value preserving isomorphism which
is the identity on $K$.










Proof. We must first verify that the above definition actually defines a
valuation of $K(z)$, i.e., we must show that


(5)  $V[g(z)h(z)] = Vg(z) + Vh(z)$

and

(6)  $V[g(z) + h(z)] \ge \min [Vg(z), Vh(z)]$


for all rational functions $g(z)$ and $h(z)$. But the truth of (5) and (6) follows at
once from the truth of the corresponding equations with $z$ replaced by $a_\rho$.

Next, we wish to show that, with this valuation, $K(z)$ is an immediate extension
of $K$. By definition, $Vf(z) = Vf(a_\rho)$ for large $\rho$, so there is clearly no
extension of the value group. To prove the same for the residue class field, it
will suffice to take any polynomial $f(z)$ with $Vf(z) = 0$, and find an element
$b \in K$ with $V[f(z) - b] > 0$, for then $f(z)$ and $b$ will lie in the same residue class.
Since $\{f(a_\rho)\}$ is ultimately pseudo-convergent, we have

$V[f(a_\tau) - f(a_\sigma)] > V[f(a_\sigma) - f(a_\rho)] \ge 0$

for sufficiently large $\rho < \sigma < \tau$. But, by definition,

$V[f(z) - f(a_\sigma)] = V[f(a_\tau) - f(a_\sigma)]$

for large $\tau$. Therefore, $V[f(z) - f(a_\sigma)] > 0$ so that $f(z)$ and $f(a_\sigma)$ lie in the
same residue class.

To show that $z$ is a limit of $\{a_\rho\}$, we observe that

(7)  $V(z - a_\rho) = V(a_\sigma - a_\rho)$  ($\rho < \sigma$)

for large $\rho$. An application of Lemma 1 to the pseudo-convergent set $\{z - a_\rho\}$
yields that $\{V(z - a_\rho)\}$ is monotone increasing. Then from the identity


$(z - a_\rho) - (z - a_\sigma) = a_\sigma - a_\rho$,


we obtain (7) for all $\rho$.

It remains to prove the final statement of Theorem 2. Regardless of the
characteristic of $K$, it is possible to form a Taylor expansion for a polynomial
$f(u)$ of degree $m$:


(8)  $f(u) - f(a_\rho) = (u - a_\rho) f_1(a_\rho) + \cdots + (u - a_\rho)^m f_m(a_\rho)$,


where $f_k(u)$ may be thought of as replacing the formal expression $f^{(k)}(u)/k!$.
(See, for example, [1], p. 165, Ex. 2, or [2].) By hypothesis it is possible to cut
into $\{a_\rho\}$ so far that the values of $f(a_\rho), f_1(a_\rho), \ldots, f_m(a_\rho)$ are all independent
of $\rho$. We shall suppose that this has been done, and let us write $\beta_i$ for $Vf_i(a_\rho)$
($i = 1, \ldots, m$). We apply Lemma 4 with $t_i = i$ ($i = 1, \ldots, m$) and $\gamma_\rho =
V(u - a_\rho)$. Since


$\beta_i + t_i\gamma_\rho = V[(u - a_\rho)^i f_i(a_\rho)]$,


10 Similar results are proved in [4], p. 194 and [9], p. 374. 







it follows that for sufficiently large $\rho$ some one of the terms

$(u - a_\rho)^i f_i(a_\rho)$  ($i = 1, \ldots, m$)

has smaller value than all the others. This means that the value of the right
member of (8) increases monotonically with $\rho$ for large $\rho$. Since $Vf(a_\rho)$ is fixed,
this is possible only if $Vf(u) = Vf(a_\rho)$ for large $\rho$, which in turn implies that
$Vf(u) = Vf(z)$. We have obtained an explicit analytical equivalence over $K$
between the fields $K(u)$ and $K(z)$.
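
The Taylor expansion (8) used in this proof can be made concrete. The sketch below assumes the usual reading of the coefficient polynomials, $f_k(x) = \sum_j \binom{j}{k} c_j x^{j-k}$ for $f(x) = \sum_j c_j x^j$, which involves only integer binomial coefficients and therefore makes sense in any characteristic. The function names `taylor_coefficient` and `evaluate` are illustrative, and the identity is checked only for one made-up polynomial over the rationals.

```python
from fractions import Fraction
from math import comb

def taylor_coefficient(coeffs, k):
    """Coefficients of f_k, where coeffs[j] is the coefficient of x**j in f."""
    return [comb(j, k) * c for j, c in enumerate(coeffs)][k:]

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial with the given coefficient list."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

f = [5, -3, 0, 2, 1]                      # 5 - 3x + 2x^3 + x^4
u, a = Fraction(7, 3), Fraction(-2, 5)

lhs = evaluate(f, u) - evaluate(f, a)
rhs = sum((u - a) ** k * evaluate(taylor_coefficient(f, k), a)
          for k in range(1, len(f)))
print(lhs == rhs)

# For f = x^3 the coefficient polynomial f_3 is the constant 1, although the
# ordinary third derivative 6 would vanish in characteristic 3.
f3_of_cube = taylor_coefficient([0, 0, 0, 1], 3)
```

The second example shows why the text speaks of $f_k$ "replacing the formal expression" $f^{(k)}/k!$: the division by $k!$ is never actually carried out.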

THEOREM 3. If $\{a_\rho\}$ is a pseudo-convergent set of algebraic type in $K$, without
a limit in $K$, then there exists an immediate algebraic extension $K(z)$ of $K$, which
can be explicitly obtained as follows. Among the polynomials $f(x)$ for which (4)
holds, choose one of least degree $n$, say $q(x)$. Let $z$ be a root of $q(x) = 0$, and for
any polynomial $f(z)$ of degree less than $n$, define $Vf(z)$ to be the fixed value which
$Vf(a_\rho)$ ultimately assumes. In the resulting valuation, $K(z)$ is an immediate
extension of $K$, and $z$ is a limit of $\{a_\rho\}$.

Conversely, if $u$ is a root of $q(x) = 0$, and if $K(u)$ has a valuation which is an
extension of $V$ such that $u$ is a limit of $\{a_\rho\}$, then $K(u)$ and $K(z)$ are analytically
equivalent over $K$.

Proof. First, it is necessary to remark that the polynomial $q(x)$ is irreducible
and of degree $\ge 2$. For, if $q(x) = b(x - c)$, then $V(c - a_\rho)$ increases monotonically
for large $\rho$. But $\{c - a_\rho\}$ is pseudo-convergent; by Lemma 1, $V(c - a_\rho)$
increases monotonically for all $\rho$, whence it follows that


$V(c - a_\rho) = V(a_\sigma - a_\rho)$  ($\rho < \sigma$),


and $c$ is a limit of $\{a_\rho\}$, contrary to hypothesis. Again, if $q(x) = q_1(x)q_2(x)$,
where $q_1$ and $q_2$ are polynomials of degree less than $n$, then $V[q_1(a_\rho)q_2(a_\rho)]$
increases monotonically for large $\rho$; the same must, therefore, hold for either
$Vq_1(a_\rho)$ or $Vq_2(a_\rho)$, contradicting the minimal choice of $q(x)$.

The remainder of the proof, with one exception, is a duplication of the proof 
of Theorem 2, the discussion being, of course, confined to polynomials of degree 
less than n. The one exceptional point is the proof of the multiplicative char- 
acter of the valuation of K(z), and this proof we shall now give. 

$K(z)$ consists of polynomials in $z$ of degree less than $n$, with coefficients in $K$.
The product $h(z)$ of two such polynomials $f(z)$ and $g(z)$ is defined by an equation
of the form


$f(z)g(z) = h(z) + k(z)q(z)$.

We have, for all $\rho$,

(9)  $f(a_\rho)g(a_\rho) - h(a_\rho) = k(a_\rho)q(a_\rho)$.


Now, for large $\rho$, the value of the right member of (9) increases monotonically,
while $V[f(a_\rho)g(a_\rho)]$ and $Vh(a_\rho)$ are fixed for large $\rho$. This is possible only if


$Vh(a_\rho) = V[f(a_\rho)g(a_\rho)]$










for large $\rho$, whence, by the definition of $V$ on $K(z)$,

$Vh(z) = Vf(z) + Vg(z)$,

as desired.
Upon combining Theorems 1, 2, and 3, we obtain 
THEOREM 4. A field with a valuation is maximal if and only if it contains a 
limit for each of its pseudo-convergent sets. 


3. Uniqueness of the maximal extension. It was proved by Krull ([4], 
Th. 24, p. 191) that any field with a valuation possesses at least one immediate 
maximal extension. It is natural to inquire whether this extension is uniquely 
determined. More precisely, if N and N’ are two immediate maximal ex- 
tensions of K, we ask whether there exists between N and N’ an analytical 
equivalence over K. 

It is first of all clear from Theorems 2 and 3 that N or N’ can be obtained 
from K by a transfinite series of adjunctions of limits of pseudo-convergent sets. 
If we can demonstrate that each of these adjunctions takes place in a unique 
fashion, we shall have obtained an affirmative answer to the question of unique- 
ness. In the case of transcendental pseudo-convergent sets, uniqueness is 
already assured us by Theorem 2. It only remains to examine sets of algebraic 
type, and here, as will appear, uniqueness can indeed fail. 

By the use of Theorem 3, we can reformulate the question as follows. Suppose
the pseudo-convergent set $\{a_\rho\}$ is of algebraic type in $K$, and let $q(x)$ be
a polynomial of least degree such that $Vq(a_\rho)$ ultimately increases monotonically.
Let $N$ be any immediate maximal extension of $K$.

(*) Does $N$ contain a limit of $\{a_\rho\}$ which is also a root of $q(x) = 0$?

It is now clear that the answer to the question of the uniqueness of the maximal 
extension hinges entirely on the answer to the question (*). 

We shall adopt the following fixed notation for the discussion. Let the
degree of $q$ be $n$. Let $q_i$ denote the $i$-th formal derivative of $q$. Cut into $\{a_\rho\}$
sufficiently far so that $Vq_i(a_\rho)$ ($i = 1, \ldots, n$) is independent of $\rho$, equal, say,
to $\beta_i$.¹¹ Denote $V(a_\sigma - a_\rho)$ ($\rho < \sigma$) by $\gamma_\rho$. Finally, let $\mathfrak{K}$, the residue class
field of $K$, have characteristic $p$. We treat explicitly only the case where $p$
is finite, but as a matter of fact the proof can be read equally well for the case
$p = \infty$; it is only necessary to replace throughout all powers of $p$ by unity.

First, we prove a simple number-theoretic lemma. 


LEMMA 6. If $p$ is prime, and $r$ is a positive integer prime to $p$, $r > 1$, then
$\binom{p^t r}{p^t}$ is prime to $p$, for any integer $t \ge 0$.


Proof.

$\binom{p^t r}{p^t} = \dfrac{p^t r\,(p^t r - 1) \cdots (p^t r - p^t + 1)}{p^t\,(p^t - 1) \cdots 1}$.


11 The fact that some of the $\beta$'s may be infinite does not vitiate any of the arguments.







In the numerator of this fraction, the first factor is divisible by precisely $p^t$,
while the remaining ones are not divisible by $p^t$. Hence, for every factor $m$
occurring in the numerator, the factor $m - p^t(r - 1)$ which occurs in the
denominator will be divisible by $p$ to precisely the same power. This gives
the desired result.
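
Lemma 6 is a statement about ordinary integers, so it can be spot-checked directly. The following sketch does so for a few small primes using the standard-library function `math.comb`; the helper name `check_lemma6` and the bounds `max_t`, `max_r` are arbitrary choices, and a finite check of this kind is of course no substitute for the proof.

```python
from math import comb

def check_lemma6(p, max_t=3, max_r=12):
    """True if C(p^t * r, p^t) is prime to p for all tested t and all tested r prime to p."""
    for t in range(max_t + 1):
        for r in range(2, max_r + 1):
            if r % p == 0:
                continue            # the lemma requires r prime to p
            if comb(p**t * r, p**t) % p == 0:
                return False
    return True

results = {p: check_lemma6(p) for p in (2, 3, 5)}
print(results)

# The hypothesis (r, p) = 1 matters: with p = 3, t = 1, r = 3 the coefficient
# C(9, 3) = 84 is divisible by 3.
counterexample_without_hypothesis = comb(9, 3) % 3 == 0
```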

LEMMA 7. If $i = p^t$, $j = p^t r$ with $r > 1$, $(r, p) = 1$, then

$\beta_i + i\gamma_\rho < \beta_j + j\gamma_\rho$

for all sufficiently large $\rho$.

Proof. We form a Taylor expansion for $q_i(a_\sigma)$. In doing so it is necessary
to introduce certain binomial coefficients ([2], p. 226).


(10)  $q_i(a_\sigma) - q_i(a_\rho) = (i + 1)(a_\sigma - a_\rho)q_{i+1}(a_\rho) + \cdots + \binom{j}{i}(a_\sigma - a_\rho)^{j-i}q_j(a_\rho) + \cdots + \binom{n}{i}(a_\sigma - a_\rho)^{n-i}q_n(a_\rho)$.


Consider the right member of (10) with $\rho < \sigma$. By Lemma 4, there will be
among these terms precisely one of least value, provided $\rho$ is sufficiently large.
The value of this term must then equal the value of the left member of (10),
which in turn is not less than $\beta_i$. It follows that the term


$\binom{j}{i}(a_\sigma - a_\rho)^{j-i}q_j(a_\rho)$,


occurring in (10), must also have value not less than $\beta_i$. But, by Lemma 6,
$\binom{j}{i}$ has value zero. Therefore,

$\beta_i \le (j - i)\gamma_\rho + \beta_j$,

and the result follows at once from the fact that $\{\gamma_\rho\}$ is monotone increasing.


LEMMA 8. There is an integer $h$, which is a power of $p$, such that for all
sufficiently large $\rho$


(11)  $\beta_i + i\gamma_\rho > \beta_h + h\gamma_\rho$  ($i \ne h$)

and

(12)  $Vq(a_\rho) = \beta_h + h\gamma_\rho$.

Proof. Consider

(13)  $q(a_\sigma) = q(a_\rho) + (a_\sigma - a_\rho)q_1(a_\rho) + \cdots + (a_\sigma - a_\rho)^n q_n(a_\rho)$


with $\rho < \sigma$. Applying Lemma 4 to the terms on the right of (13) other than
$q(a_\rho)$, we find that for large $\rho$ there is precisely one of them, say $(a_\sigma - a_\rho)^h q_h(a_\rho)$,
of least value; this proves (11). Now, keeping $\rho$ fixed and varying $\sigma$, we observe
that $Vq(a_\sigma)$ increases monotonically. This is possible only if

$Vq(a_\rho) = V[(a_\sigma - a_\rho)^h q_h(a_\rho)] = \beta_h + h\gamma_\rho$.


That $h$ is a power of $p$ is an immediate consequence of Lemma 7.










Throughout the succeeding discussion, we shall reserve the letter h for the 
integer occurring in Lemma 8. 


LEMMA 9. If $y$ is a limit of $\{a_\rho\}$, then


(14)  $Vq(y) > \beta_h + h\gamma_\rho$

for all $\rho$, and

(15)  $Vq_i(y) = \beta_i$  ($i = 1, \ldots, n$).

Proof. We have

(16)  $q(y) = q(a_\rho) + (y - a_\rho)q_1(a_\rho) + \cdots + (y - a_\rho)^n q_n(a_\rho)$,


where, by definition, $V(y - a_\rho) = \gamma_\rho$. By Lemma 8, the terms of least value
on the right of (16) are $q(a_\rho)$ and $(y - a_\rho)^h q_h(a_\rho)$. Therefore, $Vq(y) \ge Vq(a_\rho)$,
and, in view of the monotone increasing character of $\{\gamma_\rho\}$, we obtain (14).
To prove (15) we form the Taylor expansion


(17)  $q_i(y) - q_i(a_\rho) = (i + 1)(y - a_\rho)q_{i+1}(a_\rho) + \cdots + \binom{n}{i}(y - a_\rho)^{n-i}q_n(a_\rho)$.


By Lemma 4, the value of the right member of (17) increases for large $\rho$; hence,


$Vq_i(y) = Vq_i(a_\rho)$.
The following result is not required in the present connection, but, for
convenience, the proof will be given at this point.


LEMMA 10. Suppose the value group $\Gamma$ is Archimedean, and suppose further
that the breadth of $\{a_\rho\}$ is not the zero ideal. Let


$q^*(x) = \sum_{i = p^m} (x - a_\theta)^i q_i(a_\theta)$,


the summation ranging as indicated only over powers of $p$. Then for sufficiently
large $\theta$, and $\rho > \theta$,


(18)  $Vq^*(a_\rho) = Vq(a_\rho) = \beta_h + h\gamma_\rho$.

Proof. Suppose $i$ is not a power of $p$, and let $k$ be the highest power of $p$
dividing $i$. By Lemma 7, for large $\rho$,

(19)  $\beta_i + i\gamma_\rho > \beta_k + k\gamma_\rho$.

The hypothesis that the breadth of $\{a_\rho\}$ is not the zero ideal means that the
real numbers $\gamma_\rho$ approach a finite limit $S$. From (19), a fortiori,

$\beta_i + iS > \beta_k + kS$.

It follows that we can choose a fixed $\mu$ so large that

(20)  $\beta_i + i\gamma_\mu > \beta_k + k\gamma_\rho$


for all $\rho$. Thus, for every $i$ not a power of $p$, there is a corresponding $\mu$ such
that (20) holds. Let $\theta$ be any ordinal exceeding all these $\mu$'s. Then


(21)  $\beta_i + i\gamma_\theta > \beta_k + k\gamma_\rho \ge \beta_h + h\gamma_\rho$


holds for every $i$ not a power of $p$. Next, we write


(22)  $q(a_\rho) = \sum_{i=0}^{n} (a_\rho - a_\theta)^i q_i(a_\theta)$.


Since, by Lemma 8, $Vq(a_\rho) = \beta_h + h\gamma_\rho$, it follows from (21) that we can eject
from (22) the terms for which $i$ is not a power of $p$ without altering the value
of the right hand member; this yields (18) at once.

We are now able to obtain our principal result on the uniqueness of the
immediate maximal extension. In the event that the residue class field $\mathfrak{K}$ has
finite characteristic $p$, the requisite hypothesis is contained in the following
two statements:

(1) Any equation of the form


$x^{p^n} + a_1 x^{p^{n-1}} + \cdots + a_{n-1} x^p + a_n x + a_{n+1} = 0$,


with coefficients in $\mathfrak{K}$, has a root in $\mathfrak{K}$.¹²

(2) The value group $\Gamma$ satisfies $\Gamma = p\Gamma$.

For convenience, we shall refer to this pair of conditions as "hypothesis A".
If the characteristic of $\mathfrak{K}$ is infinite, we shall further agree that hypothesis A
is vacuous. The proof of the following theorem can then be read for this case
in the light of our previous remark that all powers of $p$ are to be replaced
by unity.¹³


THEOREM 5. Let the field $K$ have a valuation with value group $\Gamma$ and residue
class field $\mathfrak{K}$, such that $\mathfrak{K}$ and $\Gamma$ satisfy hypothesis A. Then the immediate maximal
extension of $K$ is uniquely determined up to analytical equivalence over $K$.


Proof. As we remarked above, it follows from Theorems 2 and 3 that we
need only prove the following statement: if $\{a_\rho\}$ is pseudo-convergent of
algebraic type in $K$ with $q(x)$ for a minimal polynomial, and if $N$ is any immediate
maximal extension of $K$, then $N$ contains a limit of $\{a_\rho\}$ which is also a root
of $q(x) = 0$. This is done by a transfinite approximation, for which purpose
it is convenient first to prove the following lemma. (The symbols $\gamma_\rho$, $\beta_i$ and
$h$ are used as defined above.)


LEMMA 11. If for some limit $t \in N$ of $\{a_\rho\}$ we have $Vq(t) = \alpha$, we can obtain
the better approximation $Vq(t^*) > \alpha$, where $t^* \in N$ is a limit of $\{a_\rho\}$ such that


$V(t^* - t) = \max_{i = p^m} (\alpha - \beta_i)/i$,


$i$ ranging as indicated over the powers of $p$ ($1 \le i \le n$).


12 Concerning this rather unusual hypothesis we wish to remark that it definitely falls 
short of algebraic closure. An example is provided by taking the Galois field of p elements 
(p > 2) and closing it off with respect to extensions of odd degree. 

13 In fact, the whole discussion could be greatly shortened if we were interested in this
case only. In particular, the transfinite approximation in Theorem 5 could be replaced 
by a simple application of the Hensel-Rychlik theorem. 










Proof. Write


(23)  $\delta = \max (\alpha - \beta_i)/i$,

the range of $i$ being the powers of $p$, as it will be throughout the proof. Taking
$i = h$ in (23) and using Lemma 9, we obtain


(24)  $\delta > (\beta_h + h\gamma_\rho - \beta_h)/h = \gamma_\rho$


for all $\rho$. Let $k \in N$ be any element of value $\delta$. (This is possible, as hypothesis
A implies $\delta \in \Gamma$.) For any $z \in N$,


(25)  $q(t + kz)/q(t) = \sum_j k^j z^j q_j(t)/q(t)$.


In the polynomial (25) the coefficient of $z^j$ has value $j\delta + \beta_j - \alpha$. If $j$ is a power
of $p$, we have


(26)  $j\delta + \beta_j - \alpha \ge 0$

by (23). If $j$ is not a power of $p$, and $i$ is the highest power of $p$ dividing $j$,
then

$j\delta + \beta_j - \alpha > i\delta + \beta_i - \alpha \ge 0$


by Lemma 7, (24) and (26). Taking these facts together, we observe that if
we replace each coefficient in (25) by its residue class, we obtain a polynomial
with coefficients in $\mathfrak{K}$, say $F(z)$, of precisely the type used in hypothesis A.
Hence, $\mathfrak{K}$ contains a root $\bar{z}_1$ of $F(z) = 0$. If $z_1 \in N$ is any representative of the
residue class $\bar{z}_1$, we then have $V[q(t + kz_1)/q(t)] > 0$ or $Vq(t + kz_1) > \alpha$. Also,
by the choice of $k$, $V(kz_1) = \delta$. By (24), $kz_1$ lies in the breadth of $\{a_\rho\}$, whence
by Lemma 3, $t + kz_1$ is a limit of $\{a_\rho\}$. With the choice of $t^* = t + kz_1$, we
have therefore proved Lemma 11.

We now resume the proof of Theorem 5. We are going to select a transfinite
set of elements $\{t_\mu\}$ of $N$ such that:

(1) each $t_\mu$ is a limit of $\{a_\rho\}$;

(2) if $Vq(t_\mu) = \alpha_\mu$, then $\alpha_\mu < \alpha_\nu$ ($\mu < \nu$);

(3) $V(t_\nu - t_\mu) = \max (\alpha_\mu - \beta_i)/i$ ($\mu < \nu$), the range of $i$ again being the
powers of $p$.

Let us first observe that the proof of Theorem 5 can then be easily completed;
for, a cardinal number consideration shows that the choice of the $t$'s must
terminate with the appearance of an element $t_\xi \in N$, which is a limit of $\{a_\rho\}$, and
for which $Vq(t_\xi) = \infty$, or $q(t_\xi) = 0$. For $t_1$, we choose any limit of $\{a_\rho\}$ in $N$
(there is at least one by Theorem 4), and suppose $t_\mu$ has been chosen for all
$\mu < \lambda$ so as to satisfy (1), (2), and (3) for $\mu < \nu < \lambda$.

(i) $\lambda$ a limit number. Then (2) and (3) imply


$V(t_\nu - t_\mu) < V(t_\sigma - t_\nu)$  ($\mu < \nu < \sigma < \lambda$),


showing that $\{t_\mu\}_{\mu < \lambda}$ is pseudo-convergent. Let $t_\lambda \in N$ be any limit of $\{t_\mu\}_{\mu < \lambda}$.







As an immediate consequence of the definition of limit, we have

(27) $V(t_\lambda - t_\mu) = \max\,(\alpha_\mu - \beta_i)/i \qquad (\mu < \lambda).$

From (27) and Lemma 9, as in (24), we have $V(t_\lambda - t_\mu) > \gamma_\rho$, so that $t_\lambda - t_\mu$
is in the breadth of $\{a_\rho\}$, and $t_\lambda$ is a limit of $\{a_\rho\}$ by Lemma 3. Finally, we
must prove

(28) $Vq(t_\lambda) > \alpha_\mu \qquad (\mu < \lambda).$

We write

(29) $q(t_\lambda) = q(t_\mu) + (t_\lambda - t_\mu)\,q_1(t_\mu) + \cdots + (t_\lambda - t_\mu)^n q_n(t_\mu).$

For $j$ a power of $p$, by Lemma 9,

(30) $V[(t_\lambda - t_\mu)^j q_j(t_\mu)] = \beta_j + j \max\,(\alpha_\mu - \beta_i)/i \ge \alpha_\mu ;$

while for $j$ not a power of $p$, (30) follows a fortiori from Lemma 7. Applying
these facts to (29), we obtain $Vq(t_\lambda) \ge \alpha_\mu$, which suffices to prove (28).

(ii) $\lambda$ not a limit number. Here $t_{\lambda-1}$ is given, and by Lemma 11, we can find
a limit $t_\lambda$ of $\{a_\rho\}$ such that

(31) $Vq(t_\lambda) > \alpha_{\lambda-1}$

and

(32) $V(t_\lambda - t_{\lambda-1}) = \max\,(\alpha_{\lambda-1} - \beta_i)/i.$

From (31) and (32), respectively, (28) and (27) readily follow. With this the
induction is complete.


4. The structure of maximal fields. The results obtained in §3 will now
enable us to obtain explicit theorems on the structure of maximal fields and
their representations as power series fields.

If $\mathfrak{K}$ is any field and $\Gamma$ is any ordered Abelian group, the set of all formal
series

$$\sum a_\rho t^{\alpha_\rho} \qquad (a_\rho \in \mathfrak{K},\ \alpha_\rho \in \Gamma,\ \{\alpha_\rho\}\ \text{well-ordered})$$

forms a field, when addition and multiplication are defined in the usual formal
fashion. This field we may denote by $\mathfrak{K}(t^\Gamma)$.¹⁴ In $\mathfrak{K}(t^\Gamma)$ we can introduce a
valuation $V$ by setting

$$V\Bigl(\sum a_\rho t^{\alpha_\rho}\Bigr) = \alpha_1 \qquad (a_1 \ne 0).$$

Krull has proved that in this valuation $\mathfrak{K}(t^\Gamma)$ is maximal ([4], p. 193).
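
The formal operations and the valuation just described can be tried out on a small scale. The following is only a sketch under stated assumptions: the class name `Series` is hypothetical, the coefficient field is taken to be the rationals, the value group is the additive group of rationals, and only finitely supported series are represented (a general element of the power series field may have infinite well-ordered support).

```python
from fractions import Fraction


class Series:
    """Finitely supported formal series sum a_i t^(alpha_i), exponents in Q.

    Assumption for this sketch: coefficients are rationals, and the support
    is finite (a special case of a well-ordered support).
    """

    def __init__(self, terms):
        # terms: mapping exponent -> coefficient; zero coefficients are dropped
        self.terms = {Fraction(e): Fraction(c) for e, c in terms.items() if c != 0}

    def __add__(self, other):
        out = dict(self.terms)
        for e, c in other.terms.items():
            out[e] = out.get(e, 0) + c
        return Series(out)

    def __mul__(self, other):
        out = {}
        for e1, c1 in self.terms.items():
            for e2, c2 in other.terms.items():
                out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
        return Series(out)

    def valuation(self):
        """Least exponent carrying a nonzero coefficient; None stands for V(0)."""
        return min(self.terms) if self.terms else None


x = Series({Fraction(-1, 2): 1, 0: 3})      # t^(-1/2) + 3
y = Series({Fraction(1, 3): 2, 1: -1})      # 2 t^(1/3) - t
z = Series({Fraction(-1, 2): -1})           # -t^(-1/2)
```

With these sample elements the value of a product is the sum of the values, and the value of a sum is at least the smaller of the two values, with strict increase when leading terms cancel (as in `x + z`).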

We now pose the converse query: is a maximal field $K$, with value group $\Gamma$
and residue class field $\mathfrak{K}$, analytically isomorphic to $\mathfrak{K}(t^\Gamma)$? As the first step
in obtaining such a representation, we must find a subfield $M$ of $K$ which can
serve as the coefficient field. Let $H$ denote the homomorphism mapping every


¹⁴ For power series fields cf. [4], [8], [10].
















MAXIMAL FIELDS WITH VALUATIONS 315 


$a \in K$ with $Va \ge 0$ into its residue class, and mapping every $a \in K$ with $Va < 0$
into $\infty$.¹⁵ Then the property we desire for $M$ is represented by the equation
$H(M) = \mathfrak{K}$.

LEMMA 12. Let $K$ have the same characteristic as its residue class field $\mathfrak{K}$,
and suppose $K$ is algebraically perfect, and satisfies the Hensel-Rychlik theorem.¹⁶
Then $K$ possesses a subfield $M$ with $H(M) = \mathfrak{K}$.


Proof. The proof requires only a slight amplification of Lemma 2, [7]. If
$P$ and $\mathfrak{P}$ denote the prime subfields of $K$ and $\mathfrak{K}$, respectively, we necessarily
have $H(P) = \mathfrak{P}$. We build up $M$ by successive adjunctions as in MacLane's
proof, and the only point that needs further investigation is the case of an
inseparable algebraic extension. Suppose then that we have $H(N) = \mathfrak{N}$,
and we wish to obtain an extension of $N$ corresponding to $\mathfrak{N}(\bar a^{1/p})$, $\bar a \in \mathfrak{N}$. Let
$a \in N$ be the representative of $\bar a$. By hypothesis $a^{1/p} \in K$, and $N(a^{1/p})$ provides
the desired extension.

Next we must obtain a set of elements playing the role of the elements $\{t^\alpha\}$
in a power series field. It will, however, in general be necessary to admit a
factor set for the multiplication of these elements.


LEMMA 13. With the same hypothesis as in Lemma 12, and with a fixed choice
of the field $M$, there exists a set $\{t^\alpha\}$ in $K$, with $Vt^\alpha = \alpha$ for every $\alpha \in \Gamma$, and with

$$t^\alpha t^\beta = c_{\alpha,\beta}\, t^{\alpha+\beta} \qquad (c_{\alpha,\beta} \in M),$$

where $c_{\alpha,\beta}$ is a factor set.


Proof. By well-known methods, it is possible to choose a rationally
independent basis $\{\rho_i\}$ for $\Gamma$, i.e., a set $\{\rho_i\}$ such that every $\alpha \in \Gamma$ has a unique
representation as a sum of $\rho$'s with rational coefficients. For $t^{\rho_i}$ we choose any
element of value $\rho_i$; and if $\alpha$ is a sum of $\rho$'s with integral coefficients, we choose
for $t^\alpha$ the product of the corresponding elements $t^{\rho_i}$ raised to the appropriate
powers. Suppose finally that $\alpha$ is of the form

$$\alpha = r_1\rho_1 + r_2\rho_2 + \cdots + r_m\rho_m$$

with not all the $r_i$ integral. Let $r$ be the L.C.M. of the denominators of $r_1$,
$\cdots$, $r_m$; then $t^{r\alpha}$ has already been assigned. We shall now show that it is
possible to select an element $t^\alpha$ such that

(33) $(t^\alpha)^r = a\,t^{r\alpha} \qquad (a \in M).$

Since $K$ is perfect, it suffices to take the case $(r, p) = 1$. Let $z \in K$ have value
$\alpha$, and let $a$ be the $M$-representative of $z^r/t^{r\alpha}$. Then $at^{r\alpha}/z^r$ lies in the same
residue class as 1, and, by the Hensel-Rychlik theorem, has an $r$-th root in $K$.


¹⁵ A detailed statement of the connection between $H$ and $V$ is given in [7].
¹⁶ In [4], p. 178, this theorem is proved on the hypothesis of completeness, a weaker
condition than maximality.










Therefore, $at^{r\alpha}$ also has an $r$-th root, and this is our choice for $t^\alpha$. For any
other $\beta \in \Gamma$ we will have the similar equations:

(34) $(t^\beta)^s = b\,t^{s\beta} \qquad (b \in M),$
(35) $(t^{\alpha+\beta})^u = c\,t^{u(\alpha+\beta)} \qquad (c \in M).$

From (33), (34), and (35) we obtain

$$(t^\alpha t^\beta/t^{\alpha+\beta})^{rsu} = a^{su}b^{ru}/c^{rs} \in M.$$

Since $M$ is a coefficient field, it follows from this that $t^\alpha t^\beta/t^{\alpha+\beta} \in M$, as desired.

Before we can obtain our structure theorem, it will be necessary to prove
the following result.


LEMMA 14. Let $N$ be a maximal field of characteristic $p$, with value group $\Gamma$,
and residue class field $\mathfrak{K}$, and suppose $\mathfrak{K}$ is perfect and $\Gamma = p\Gamma$. Then $N$ is perfect.


Proof. Suppose that, on the contrary, $a \in N$ has no $p$-th root in $N$. We
construct the extension $M = N(a^{1/p})$. Because $N$ is complete in Krull's sense,
the valuation of $N$ extends to $M$ in the following unique manner.¹⁷ Any $b$
in $M$ but not in $N$ will satisfy an irreducible equation of the form

$$x^p + c_1x^{p-1} + \cdots + c_p = 0 \qquad (c_i \in N)$$

and to $b$ we assign the value $V(c_p)/p$. Plainly this involves no extension of $\Gamma$.
Furthermore, since any element in $M$ is a $p$-th root of some element in $N$, it
follows readily that there is no extension of $\mathfrak{K}$. Hence $M$ is an immediate
extension of $N$, contrary to the hypothesis that $N$ is maximal.

The corresponding lemma for the characteristic unequal case will also be 
needed, but here a stronger hypothesis must be made. 


LEMMA 15. Let $N$ be maximal and of infinite characteristic, while the residue
class field $\mathfrak{K}$ has characteristic $p$, and suppose $\Gamma$ and $\mathfrak{K}$ satisfy hypothesis A (cf.
Theorem 5). Then every element of $N$ has a $p$-th root in $N$.


Proof. We employ a transfinite approximation. Since this is carried out
in virtually the same fashion as in Theorem 5, we shall here merely summarize
the method.

First, it suffices to prove $a^{1/p} \in N$ for $Va = 0$. We make an inductive choice
of elements $\{t_\rho\}$, with $V(t_\rho^p - a) = \alpha_\rho$, such that

$$\alpha_\rho < \alpha_\sigma \qquad (\rho < \sigma);$$
$$V(t_\sigma - t_\rho) = \max\,(\alpha_\rho/p,\ \alpha_\rho - Vp) \qquad (\rho < \sigma).$$

For $t_1$ we choose any element of $N$ with $V(t_1^p - a) > 0$. (There is such an
element, since hypothesis A implies that the residue class of $a$ has a $p$-th root.)
Suppose we have chosen $t_\rho$ for all $\rho < \lambda$. If $\lambda$ is not a limit number, it follows
just as in Lemma 11 that there exists $t_\lambda$ with $V(t_\lambda^p - a) > \alpha_{\lambda-1}$, and $V(t_\lambda - t_{\lambda-1}) =
\max\,(\alpha_{\lambda-1}/p,\ \alpha_{\lambda-1} - Vp)$. If $\lambda$ is a limit number, then $\{t_\rho\}_{\rho<\lambda}$ is
pseudo-convergent; for $t_\lambda$ we then choose any limit of $\{t_\rho\}_{\rho<\lambda}$. When the approximation
terminates, we obtain a $p$-th root of $a$ in $N$, as desired.

¹⁷ [4], p. 180.

We are now able to prove our first structure theorem. For brevity let us
denote by $\mathfrak{K}(t^\Gamma, c_{\alpha,\beta})$ the power series field in which multiplication takes place
according to the rule: $t^\alpha t^\beta = c_{\alpha,\beta}\,t^{\alpha+\beta}$ $(c_{\alpha,\beta} \in \mathfrak{K})$.


THEOREM 6. Let the maximal field $N$, with value group $\Gamma$ and residue class
field $\mathfrak{K}$, have the same characteristic as $\mathfrak{K}$; and suppose $\mathfrak{K}$ and $\Gamma$ satisfy hypothesis A.
Then $N$ is analytically isomorphic to a power series field $\mathfrak{K}(t^\Gamma, c_{\alpha,\beta})$.


Proof. By Lemma 14, $N$ is algebraically perfect. We are then able to apply
Lemmas 12 and 13, obtaining a coefficient field $M$ which we may identify
with $\mathfrak{K}$ and a set of representatives $\{u^\alpha\}$ with

$$u^\alpha u^\beta = c_{\alpha,\beta}\,u^{\alpha+\beta} \qquad (c_{\alpha,\beta} \in \mathfrak{K}).$$

Form the subfield $K$ of $N$ obtained by adjoining to $\mathfrak{K}$ all the elements $u^\alpha$, and
let $K'$ denote the analogous subfield of $\mathfrak{K}(t^\Gamma, c_{\alpha,\beta})$, i.e., the field obtained by
adjoining to $\mathfrak{K}$ all elements $t^\alpha$. Let $T$ be the natural map of $K'$ on $K$, i.e.,
$T$ is the identity on $\mathfrak{K}$ and $Tt^\alpha = u^\alpha$. Plainly $T$ is a homomorphism; but,
moreover, $T$ preserves values, and so must actually be an analytic isomorphism.
Now $N$ and $\mathfrak{K}(t^\Gamma, c_{\alpha,\beta})$ are immediate maximal extensions of $K$ and $K'$,
respectively. By Theorem 5 and our hypothesis, $N$ and $\mathfrak{K}(t^\Gamma, c_{\alpha,\beta})$ are analytically
isomorphic.

It is natural to inquire in what circumstances the factor set occurring in 
Theorem 6 can be dispensed with. For this purpose we require an extension 
theorem of wider scope than Theorem 5. The investigation also yields a 
uniqueness theorem for the characteristic unequal case, as will appear in Theo- 
rem 8. 


THEOREM 7.¹⁸ Let the field $K$ have value group $\Gamma$ and residue class field $\mathfrak{K}$,
and let the two maximal extensions $L$ and $L'$ of $K$ have value group $\Delta$ and residue
class field $\mathfrak{L}$, which may be proper extensions of $\Gamma$ and $\mathfrak{K}$. Then if $\Delta$ and $\mathfrak{L}$ satisfy
hypothesis A, and if every element of $\mathfrak{L}$ has an $n$-th root in $\mathfrak{L}$ for all $n$, $L$ and $L'$ are
analytically equivalent over $K$.


Proof. It will suffice to prove that $L$ and $L'$ contain analytically equivalent
subfields $N$ and $N'$ with value group $\Delta$ and residue class field $\mathfrak{L}$; for then $L$ and
$L'$ are analytically equivalent by Theorem 5. We will build up $N$ and $N'$
through a transfinite succession of fields paralleling adjunctions in $\mathfrak{L}/\mathfrak{K}$ and
$\Delta/\Gamma$, and it suffices to consider the case of a single adjunction.

Residue class field adjunction. A transcendental or separable algebraic
extension is handled exactly as in [6], Theorem 3; to treat an inseparable
algebraic extension, we observe that, by hypothesis and Lemmas 14 and 15, every
element in $L$ has a $p$-th root in $L$; so we are again able to use MacLane's method.

Value group adjunction. Consider an extension $\Gamma(\alpha)$ of $\Gamma$. If $\alpha$ is rationally
independent of $\Gamma$, simply let $a \in L$, $a' \in L'$ be any elements of value $\alpha$. Then
for any polynomial $f(x) = \sum c_ix^i$ with coefficients in $K$ we have

(36) $Vf(a) = Vf(a') = \min V(c_ia^i),$

showing that $K(a)$ and $K(a')$ are analytically equivalent. If $\alpha$ is rationally
dependent on $\Gamma$, we have $n\alpha = \beta$ for some $\beta \in \Gamma$. Let $b \in K$ have value $\beta$;
we wish to show that $b$ has an $n$-th root in $L$. Using Lemmas 14 and 15 we
reduce the consideration to the case $(n, p) = 1$. Let $z \in L$ have value $\alpha$. By
hypothesis, the residue class of $b/z^n$ has an $n$-th root in $\mathfrak{L}$. By the
Hensel-Rychlik theorem, $b/z^n$ has an $n$-th root in $L$, whence $b$ has an $n$-th root, say $a$,
in $L$. Likewise $b$ has an $n$-th root $a'$ in $L'$. Then (36) holds for polynomials of
degree less than $n$, showing again that $K(a)$ and $K(a')$ are analytically
equivalent.

¹⁸ This is the non-discrete analogue of [6], Theorem 3, and [5].
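
The reason the minimum in (36) is actually attained in the rationally independent case can be spelled out in a few lines. The following is a sketch; the symbol $\gamma_i$ for the value of a nonzero coefficient $c_i$ is introduced here only for the illustration.

```latex
% Sketch: why (36) holds when \alpha is rationally independent of \Gamma.
% Assumption: \gamma_i := V(c_i) \in \Gamma for each nonzero coefficient c_i.
\begin{align*}
V(c_i a^i) &= \gamma_i + i\alpha, \\
\gamma_i + i\alpha = \gamma_j + j\alpha \ (i \neq j)
  &\implies (i - j)\alpha = \gamma_j - \gamma_i \in \Gamma ,
\end{align*}
% which contradicts the rational independence of \alpha. The values
% V(c_i a^i) are therefore pairwise distinct, and a sum of terms with
% pairwise distinct values has the least of these values:
\[
V f(a) = \min_i V(c_i a^i) = \min_i\,(\gamma_i + i\alpha) = V f(a').
\]
```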

We now obtain our second structure theorem. 

THEOREM 8. Let the maximal field $K$ have value group $\Gamma$ and residue class field
$\mathfrak{K}$, and suppose that $\Gamma$ and $\mathfrak{K}$ satisfy hypothesis A, and that every element of $\mathfrak{K}$
has an $n$-th root in $\mathfrak{K}$, for all $n$. Then $K$ is uniquely determined, up to analytic
isomorphism, by $\mathfrak{K}$, $\Gamma$, its characteristic, and in the characteristic unequal case, $Vp$.


Proof. Let P be the prime subfield of K. The valuation of P is uniquely 
determined by the given data, up to analytic isomorphism. Theorem 8 then 
follows from Theorem 7. 


COROLLARY. In the equal characteristic case, every field with a valuation is
analytically isomorphic to a subfield of a suitable power series field.


5. Counter-examples. We will show by an example that without hypothesis
A the conclusion of Theorem 5 may fail.¹⁹

Let $\mathfrak{K}$ be a field of characteristic $p$, and let $\Gamma$ be the additive group of all
rational numbers. Let $K$ be the subfield of the power series field $\mathfrak{K}(t^\Gamma)$ obtained
by adjoining to $\mathfrak{K}$ all the elements $t^\alpha$; $K$ is then the field of all quotients of
linear expressions in the $t$'s, with coefficients in $\mathfrak{K}$. Consider the
pseudo-convergent sequence $\{a_i\}$:

$$a_i = t^{-1/p} + t^{-1/p^2} + \cdots + t^{-1/p^i}.$$


We wish first to show that $\{a_i\}$ is of transcendental type in $K$. Now the breadth
of $\{a_i\}$ is in fact precisely the valuation ring of $K$; at any rate, it is not the
zero ideal. If $\{a_i\}$ were of algebraic type in $K$, then by Lemma 10 there would
exist elements $c_j \in K$ $(j = 0, \cdots, n + 1)$ such that the value of

$$z_i = c_0a_i^{p^n} + c_1a_i^{p^{n-1}} + \cdots + c_{n-1}a_i^p + c_na_i + c_{n+1}$$

increases monotonically for large $i$. We can suppose without loss of generality
that $Vc_j \ge 0$ $(0 \le j \le n)$, $Vc_j = 0$ for at least one $j$ in the same range, and
further that all the $c$'s are actually polynomials in the $t$'s. We now imagine $z_i$
multiplied out in full, and consider the portion of $z_i$ consisting of terms $t^\alpha$ with
$\alpha < 0$; call this portion $w_i$. Suppose $c_j$ begins with the term $d_j \in \mathfrak{K}$ ($d_j$ may be
zero); then the contribution of $c_ja_i^{p^{n-j}}$ to $w_i$ will consist of $d_ja_i^{p^{n-j}}$ together
with some other terms, the latter being fixed for large $i$. Moreover, for different
$j$'s, the contributions $d_ja_i^{p^{n-j}}$ are elementwise distinct, at any rate if $p \ne 2$.
It must then be the case that, for large $i$, $w_i$ will once and for all contain a fixed
term of least value, and this statement is incompatible with the previous
assertion that $Vz_i$ increases monotonically for large $i$.

¹⁹ The principle upon which this example is constructed is the same as in the first of the
counter-examples of [7].
Now suppose that hypothesis A is violated because the equation

(37) $g(x) = x^{p^n} + b_1x^{p^{n-1}} + \cdots + b_nx = b,$

with coefficients in $\mathfrak{K}$, has no root in $\mathfrak{K}$. The formal series

$$a = t^{-1/p} + t^{-1/p^2} + \cdots + t^{-1/p^i} + \cdots$$

is a limit of $\{a_i\}$, and likewise $g(a)$ is a limit of $\{g(a_i)\}$, which along with $\{a_i\}$
is of transcendental type in $K$. By Lemma 3, $g(a) + b$ is also a limit of $\{g(a_i)\}$.
Hence, by Theorem 2, $g(a)$ and $g(a) + b$ are both transcendental over $K$ and
the mapping $g(a) \to g(a) + b$ provides an analytic automorphism of the field
$K[g(a)] = L$, say. The adjunction of $a$ to $L$, which is plainly immediate, can
be paralleled by the corresponding adjunction of a root $a'$ of the equation

(38) $g(x) = g(a) + b.$

Let $N$ and $N'$ be any immediate maximal extensions of $L(a)$ and $L(a')$,
respectively. Then there cannot exist any analytic isomorphism between $N$ and $N'$
which leaves $L$ elementwise fixed. For, if there were such an isomorphism,
then $N'$, like $N$, would contain a root of $g(x) = g(a)$. But $N'$ already contains
a root of (38). Since $g$ is additive in characteristic $p$, the difference of these two
roots would be a root of (37) in $N'$; any such root would
necessarily have value zero, and, taking residue classes, we would obtain a root
of (37) in $\mathfrak{K}$, contrary to hypothesis.
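
The passage from roots of $g(x) = g(a)$ and of (38) to a root of (37) rests on the fact that raising to the power $p^n$ is additive in characteristic $p$. The following sketch checks this on small cases; the function names are hypothetical, and polynomials over the prime field are represented (as an assumption of the sketch) by coefficient lists reduced modulo $p$.

```python
from math import comb


def middle_binomials_vanish(p, n):
    """True if C(p^n, k) is divisible by p for every 0 < k < p^n."""
    q = p ** n
    return all(comb(q, k) % p == 0 for k in range(1, q))


def poly_add(u, v, p):
    """Sum of two coefficient lists (lowest degree first), reduced mod p."""
    n = max(len(u), len(v))
    u = u + [0] * (n - len(u))
    v = v + [0] * (n - len(v))
    return [(a + b) % p for a, b in zip(u, v)]


def poly_mul(u, v, p):
    """Product of two coefficient lists, reduced mod p."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def poly_pow(u, e, p):
    """e-th power of a coefficient list by repeated multiplication."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, u, p)
    return out


# Sample data: p = 3, u = 1 + 2x, v = x + x^2.
p = 3
u, v = [1, 2], [0, 1, 1]
lhs = poly_pow(poly_add(u, v, p), p, p)                   # (u + v)^p
rhs = poly_add(poly_pow(u, p, p), poly_pow(v, p, p), p)   # u^p + v^p
```

Here `lhs` and `rhs` agree, which is the additivity of the $p$-th power map on this sample; since each term of $g$ is a coefficient times an iterate of that map, $g(x - y) = g(x) - g(y)$ follows in the same way.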

We have thus shown that the violation of hypothesis A may entail the exis- 
tence of inequivalent maximal extensions. But this example still leaves another 
question unanswered, for it might nevertheless be true that all the immediate 
maximal extensions of the above field $K$ are analytically isomorphic to $\mathfrak{K}(t^\Gamma)$.
(This actually occurs in the discrete finite rank case; MacLane [7] gives examples 
of non-unique extensions, but Schilling [10] has proved that all such fields are 
power series fields.) However, in the non-discrete case which we are consider- 
ing, uniqueness fails even in this broader sense. To show this, a somewhat 
more complicated example is needed. 

We shall use the same notation as in the preceding example and in addition
the abbreviations $p^n = q$ and $t^{1/q^2} = w_n$. Suppose that hypothesis A is
violated, this time by the fact that the element $b \in \mathfrak{K}$ has no $p$-th root in $\mathfrak{K}$.
We adjoin to the field $K$ the following elements in turn: $a$, $(a + bt)^{1/p} = u_1$,
$(u_1 + bw_1)^{1/p} = u_2$, $\cdots$, $(u_n + bw_n)^{1/p} = u_{n+1}$, etc. To assign a valuation
to these extensions, we argue as before. First, we find that

(39) $u_n^q = a + bt + b^pt^{1/p} + \cdots + b^{q/p}t^{p/q}.$

Since $a$ is a limit of $\{a_i\}$, so is $u_n^q$ by Lemma 3. Hence, $u_n$ is a limit of $\{a_i^{1/q}\}$
and, again by Lemma 3, so is $u_n + bw_n$. Also $\{a_i^{1/q}\}$, along with $\{a_i\}$, is of
transcendental type in $K$. By Theorem 2, the mapping $u_n + bw_n \to a^{1/q}$
provides an analytical equivalence over $K$ between the fields $K(u_n)$ and $K(a^{1/q})$.
Then the extension $K(u_{n+1})$ of $K(u_n)$ can be given a valuation paralleling that
of $K(a^{1/pq})$, and in this valuation $K(u_{n+1})$ is an immediate extension of $K(u_n)$.

Let $N$ be any immediate maximal extension of the field $K(a, u_1, u_2, \cdots)$.
We shall prove that N is not analytically isomorphic to a power series field. 

First we need the following elementary observation. If in a power series
field $M$ of characteristic $p$ we have a pseudo-convergent set $\{a_\rho\}$ such that
each $a_\rho$ has a $p^m$-th root in $M$, then $M$ contains some limit of $\{a_\rho\}$ which also
has a $p^m$-th root in $M$. To prove this it suffices to construct the power series
$y$ which agrees with $a_\rho$ for all terms of value less than $V(a_{\rho+1} - a_\rho)$. Then $y$
is a limit of $\{a_\rho\}$, and, since $p^m$-th root extraction goes termwise, $y$ has a $p^m$-th
root in $M$.

Now, in our case, $a_i$ has a $p^m$-th root in $N$ for all $m$. If $N$ were a power
series field, it would, therefore, contain a limit $z$ of $\{a_i\}$, with a $p^m$-th root in
$N$ for all $m$. By Lemma 3, $V(z - a) \ge 0$. Write $z = a + c + z_1$, where
$c \in \mathfrak{K}$, $Vz_1 > 0$, say, for definiteness, $Vz_1 > p/q$. Now $a + c + z_1$ is a $p^n$-th
(or $q$-th) power in $N$; together with (39) this implies that $c - b^{q/p}t^{p/q}$ +
terms of higher value is a $q$-th power in $N$. This means that the residue class
of $c$ has a $q$-th root, whence, since $\mathfrak{K}$ is a coefficient field, $c$ has a $q$-th root in $\mathfrak{K}$.
Subtracting $c$, and repeating the argument, we obtain that $b^{q/p}$ has a $q$-th root
in $\mathfrak{K}$, i.e., $b$ has a $p$-th root in $\mathfrak{K}$, contrary to our initial assumption.

Remark. This example is easily duplicated if, instead of $\mathfrak{K}$, it is $\Gamma$ that is
imperfect, i.e., if $\Gamma \ne p\Gamma$. But the author has not succeeded in constructing
a counter-example on the assumption that the general equation (37) lacks a
root. Thus the possibility remains open that a weaker condition than
hypothesis A will suffice to ensure that a maximal field in the equal characteristic
case is a power series field.


BIBLIOGRAPHY 


1. A. A. ALBERT, Modern Higher Algebra, Chicago, 1937.

2. H. HASSE AND F. K. SCHMIDT, Noch eine Begründung der Theorie der höheren
Differentialquotienten, Journal für Mathematik, vol. 177(1937), pp. 215-237.

3. F. HAUSDORFF, Mengenlehre, first edition, Berlin, 1914.

4. W. KRULL, Allgemeine Bewertungstheorie, Journal für Mathematik, vol. 167(1932),
pp. 160-196.

5. S. MACLANE, Note on the relative structure of p-adic fields, Annals of Mathematics, vol.
41(1940), pp. 751-753.

6. S. MACLANE, Subfields and automorphism groups of p-adic fields, Annals of
Mathematics, vol. 40(1939), pp. 423-442.

7. S. MACLANE, The uniqueness of the power series representation of certain fields with
valuations, Annals of Mathematics, vol. 39(1938), pp. 370-382.

8. S. MACLANE, The universality of formal power series fields, Bulletin of the American
Mathematical Society, vol. 45(1939), pp. 888-890.

9. A. OSTROWSKI, Untersuchungen zur arithmetischen Theorie der Körper, Mathematische
Zeitschrift, vol. 39(1935), pp. 269-404.

10. O. F. G. SCHILLING, Arithmetic in fields of formal power series in several variables,
Annals of Mathematics, vol. 38(1937), pp. 551-576.


HARVARD UNIVERSITY. 





ALGEBRAIC PROPERTIES OF CERTAIN MATRICES OVER A RING 
By NEAL H. McCOY


1. Introduction. A considerable part of recent work in the theory of matrices 
has been devoted to the study of matrices with elements in some domain more 
general than the field of complex numbers, which is so prominent in early papers 
on this subject. By restricting the domain to be a suitably chosen field or ring, 
different parts of the classical theory have been generalized, or analogues found, 
in various ways. In the case in which the elements are from a non-commutative 
ring, two quite different approaches to the subject have been used. In one of 
these, there is no attempt to introduce the concept of determinant, but suffi- 
ciently strong divisibility conditions are assumed in order to carry over cer- 
tain parts of the theory. In the other approach, which is used in this paper, 
the class of matrices considered is restricted in such a way that determinants, 
having many of the familiar properties of ordinary determinants, can be defined 
for the matrices under consideration. Reference here can be made to the work 
of E. H. Moore ([6],¹ Chap. II) on Hermitian matrices in what he calls a “number
system of type B”. As the present investigation was inspired by this work of 
Moore or, more precisely, by the similarity between his theorems and known 
theorems about arbitrary matrices over a commutative ring, we pause to 
describe briefly the class of matrices to which Moore’s theory is directly 
applicable. 

For the moment, let $T$ be a ring with unit element 1, in which the equation
$2x = 1$ has a unique solution and in which there is defined an anti-automorphism
or involution $a \to \bar a$. Thus

$$\overline{a + b} = \bar a + \bar b, \qquad \overline{ab} = \bar b\,\bar a, \qquad \bar{\bar a} = a.$$

We require also that the elements $a$ of $T$ such that $a = \bar a$, the so-called
symmetric elements, shall be in the center of $T$. Such a ring may be called an
involutorial ring. If $A = (a_{ij})$ is a square matrix with elements in $T$, and
$a_{ij} = \bar a_{ji}$, then $A$ is said to be a Hermitian matrix. Now a “number system
of type B”, as defined by Moore, is a special instance of an involutorial ring
and is in fact either a commutative field, of characteristic other than 2, with
$a = \bar a$, or a quadratic field over the field of symmetric elements or a generalized
quaternion algebra over this field. However, Jacobson has pointed out in [2]
that Moore's definition of determinant and many of his results remain valid for
Hermitian matrices over any involutorial ring.
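
The defining identities of an involution can be checked on a familiar instance. The sketch below uses Hamilton's quaternions with conjugation as the involution; the function names `qmul` and `conj` are hypothetical, and the sample elements have integer components purely for convenience (with rational components the equation $2x = 1$ is also solvable, as the definition requires).

```python
def qmul(x, y):
    """Hamilton product of quaternions written as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def conj(x):
    """Quaternion conjugate, playing the role of the involution a -> a-bar."""
    a, b, c, d = x
    return (a, -b, -c, -d)


x = (1, 2, 3, 4)
y = (5, -6, 7, 8)
s = (3, 0, 0, 0)   # a symmetric element: conj(s) == s
```

On these samples the product is not commutative, conjugation reverses the order of a product, the symmetric element `s` commutes with `x`, and `qmul(x, conj(x))` is itself symmetric.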

Although many of Moore’s theorems coincide in statement with known theo- 
rems about matrices over a commutative ring, the published proofs are quite 
different. We shall begin by showing how to unify these two cases, at least to a 


Received November 8, 1941. 
1 Numbers in square brackets refer to the bibliography at the end of the paper. 


certain extent. If $\mathfrak{R}$ is an arbitrary ring with unit element, we consider not the
class $\mathfrak{R}_n$ of all matrices of order $n$ over $\mathfrak{R}$ but a certain subclass $\mathfrak{R}'_n$ of $\mathfrak{R}_n$,
consisting of those matrices of $\mathfrak{R}_n$ whose elements satisfy certain weakened
commutativity relations to be defined precisely in §2. It is for elements of $\mathfrak{R}'_n$ that we
shall be able to carry over much of Moore's theory. It will be found that if $\mathfrak{R}$
is a commutative ring, then $\mathfrak{R}'_n = \mathfrak{R}_n$, while if $\mathfrak{R}$ is an involutorial ring, $\mathfrak{R}'_n$
contains all Hermitian matrices of $\mathfrak{R}_n$. Thus, both these cases appear as special
instances of the general theory.

In §2, we present the notation to be used and discuss the class $\mathfrak{R}'_n$ in some
detail. In the next section we introduce, following Moore, the concept of
adj $A$. If $A$ is a Hermitian matrix over an involutorial ring, it follows at once
from the definition that adj $A$ is also Hermitian. It is only in extending this
result to elements of $\mathfrak{R}'_n$ that we need to make any essential extension of Moore's
work. However, we shall show in Theorem 2 that if $A \in \mathfrak{R}'_n$, then adj $A \in \mathfrak{R}'_n$.
The proof is rather long and detailed, but it is necessary to establish the
theorem in order to make use of the concept of determinant of adj $A$. Further
properties of adj $A$ and related results are presented in §4. Up to this point,
our work consists roughly of showing that much of Moore's theory is valid for
elements of $\mathfrak{R}'_n$. The rest of the paper is not so directly a generalization of
Moore's work.

Let $A$ be a fixed element of $\mathfrak{R}_n$, and $\mathfrak{R}[\lambda]$ the ring of polynomials in the
indeterminate $\lambda$, with coefficients in $\mathfrak{R}$. The set of all elements

$$g(\lambda) = a_0 + a_1\lambda + \cdots + a_m\lambda^m$$

of $\mathfrak{R}[\lambda]$ such that

$$g(A) = a_0 + a_1A + \cdots + a_mA^m = 0$$

is a left ideal $\mathfrak{m}_l$ in $\mathfrak{R}[\lambda]$. In §5, we shall show how to characterize this ideal
under the assumption that $A$ is an element of $\mathfrak{R}'_n$. Naturally, the ideal $\mathfrak{m}_r$
defined in an analogous way is also considered. If $\mathfrak{C}$ is the center of $\mathfrak{R}$, the
ideal $\mathfrak{m}$ in $\mathfrak{C}[\lambda]$ of those elements $h(\lambda)$ of $\mathfrak{C}[\lambda]$ such that $h(A) = 0$ is of primary
importance. This ideal $\mathfrak{m}$ we call the minimum ideal of $A$ to emphasize the
fact that, in case $\mathfrak{R}$ is a field, it is the principal ideal generated by the minimum
function of $A$ (cf. [5]). An important property of the minimum ideal of an
element of $\mathfrak{R}'_n$ is also to be found in §5. Then, in §6, we show that a recent
theorem of Ostrowski [7] can be generalized to the case under discussion. This
theorem furnishes, for a set of matrices, a partial analogue of the notion of
minimum ideal of a single matrix.²
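
In the commutative case the simplest element of the minimum ideal is supplied by the Cayley-Hamilton theorem. As a hedged illustration (the function names are hypothetical and the matrix is an arbitrary integer sample), the following sketch evaluates $h(\lambda) = \lambda^2 - (\operatorname{tr} A)\lambda + \det A$ at a matrix $A$ of order 2 and finds the zero matrix.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_value_2x2(A):
    """Evaluate h(A) for h(lambda) = lambda^2 - tr(A) lambda + det(A), A of order 2."""
    (a, b), (c, d) = A
    trace, determinant = a + d, a * d - b * c
    A2 = mat_mul(A, A)
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - trace * A[i][j] + determinant * identity[i][j]
             for j in range(2)]
            for i in range(2)]


A = [[1, 2], [3, 4]]
result = char_poly_value_2x2(A)   # every entry vanishes, so h lies in the minimum ideal of A
```

For a scalar matrix such as $2I$ the polynomial $\lambda - 2$ already annihilates the matrix, so the minimum ideal can be generated by a polynomial of lower degree than $h$.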

We are unable to establish for arbitrary elements of $\mathfrak{R}'_n$ many theorems
known to be true in the commutative case. The remainder of the paper is
devoted to the proof of several of these theorems under further strong
restrictions. We introduce in §7, in a natural way, the notion of a quaternion ring
and thereafter limit ourselves either to the case of Hermitian matrices over a
quaternion ring or to arbitrary matrices over a commutative ring. The theorems
obtained in §8 are largely concerned with polynomials in a matrix, and are
theorems which have already been established in [3] and [5] for the
commutative case.

² For the case in which $\mathfrak{R}$ is a commutative ring, this result has already been obtained
in [4].


2. Notation and general remarks. The following notation will be used
throughout. By $\mathfrak{R}$ we denote an arbitrary ring with unit element 1 and center $\mathfrak{C}$.
The complete matric ring of order $n$ over $\mathfrak{R}$ or $\mathfrak{C}$ will be denoted by $\mathfrak{R}_n$ or $\mathfrak{C}_n$
respectively. We shall identify $\mathfrak{R}$ with a subring of $\mathfrak{R}_n$ so that in place of $aI_n$,
where $a \in \mathfrak{R}$ and $I_n$ is the unit element of $\mathfrak{R}_n$, we shall merely write $a$. If
$A \in \mathfrak{R}_n$, we shall denote by $\mathfrak{C}[A]$ the subring of $\mathfrak{R}_n$ generated by $A$ together
with elements of $\mathfrak{C}$. Thus the elements of $\mathfrak{C}[A]$ are the polynomials in $A$
with coefficients from $\mathfrak{C}$.


Now let $A = (a_{ij})$ be an element of $\mathfrak{R}_n$. If $\sigma$ is a set of distinct indices in
the range $1, 2, \cdots, n$ and $f$ is an element of $\sigma$, we define

$$s_{\sigma,f}(A) = \sum (-1)^r a_{fh_1}a_{h_1h_2} \cdots a_{h_{r-1}h_r}a_{h_rf},$$

summed over all permutations $h_1, h_2, \cdots, h_r$ of elements of $\sigma$ other than $f$.
By $\mathfrak{R}'_n$ we shall denote the class of all elements $A$ of $\mathfrak{R}_n$ with the following two
properties:

I. If $\sigma$ is any set of distinct indices in the range $1, 2, \cdots, n$ and $f$ and $g$ are
arbitrary elements of $\sigma$, then

$$s_{\sigma,f}(A) = s_{\sigma,g}(A).$$

II. For each $\sigma$ and each element $f$ in $\sigma$, $s_{\sigma,f}(A) \in \mathfrak{C}$.

Henceforth, if $A \in \mathfrak{R}'_n$, we may merely write $s_\sigma(A)$ in place of $s_{\sigma,f}(A)$ and,
if there is no question as to what matrix $A$ is under consideration, we shall use
$s_\sigma$ in place of $s_\sigma(A)$.

If now $A$ is a fixed element of $\mathfrak{R}'_n$, we may define the determinant of $A$ as
follows:³

$$|A| = \sum s_{\sigma_1}s_{\sigma_2} \cdots s_{\sigma_k},$$

summed on all partitions of $1, 2, \cdots, n$ into disjoint sets $\sigma_i$. If $A \in \mathfrak{C}_n$, it
may be verified that this definition agrees with the usual one.
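
The agreement with the usual determinant in the commutative case can be checked by direct computation. The following is a sketch only: the function names are hypothetical, indices run from 0, and the entries are ordinary integers, so that properties I and II hold automatically and the choice of $f$ within each set is immaterial.

```python
from itertools import permutations


def s_sigma(A, sigma):
    """s_{sigma,f}(A), taking f to be the first index listed in sigma."""
    f, rest = sigma[0], sigma[1:]
    r = len(rest)
    total = 0
    for hs in permutations(rest):
        path = (f,) + hs + (f,)
        term = 1
        for i, j in zip(path, path[1:]):
            term *= A[i][j]
        total += (-1) ** r * term
    return total


def set_partitions(items):
    """Yield every partition of a list into disjoint non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # place `first` in one of the existing blocks ...
        for k in range(len(part)):
            yield part[:k] + [[first] + part[k]] + part[k + 1:]
        # ... or in a new block of its own
        yield [[first]] + part


def partition_det(A):
    """Sum over partitions of the products s_{sigma_1} s_{sigma_2} ... ."""
    total = 0
    for part in set_partitions(list(range(len(A)))):
        prod = 1
        for block in part:
            prod *= s_sigma(A, tuple(block))
        total += prod
    return total


def leibniz_det(A):
    """Usual determinant as a signed sum over permutations."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = 1
        for i in range(n):
            term *= A[i][perm[i]]
        total += (-1) ** inv * term
    return total


A3 = [[2, 1, 3], [0, 4, 5], [7, 6, 1]]
A4 = [[1, 2, 0, 3], [4, 1, 5, 2], [0, 6, 1, 7], [3, 2, 8, 1]]
```

Each partition of the index set corresponds to the set of cycles of a permutation, and the sign $(-1)^r$ attached to a set of $r + 1$ indices is the sign of a cycle of that length, which is why the two sums agree term by term.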

Before proceeding, it will be appropriate to investigate the class $\mathfrak{R}'_n$ somewhat
more fully. It is to be noted that properties I and II are satisfied by arbitrary
elements of $\mathfrak{C}_n$. In fact, if $\sigma$ consists of two elements, say $f$ and $g$, property I
merely states that $a_{fg}a_{gf} = a_{gf}a_{fg}$, while property II asserts that the product
$a_{fg}a_{gf}$ is commutative with all elements of $\mathfrak{R}$. If $\sigma$ contains more than two
elements, the properties I and II are certain symmetry properties which are
weaker than full commutativity but are automatically satisfied if
commutativity is assumed.

³ This is Moore's definition. See [6], p. 115.










If $\mathfrak{R}$ is an involutorial ring, then $\mathfrak{R}'_n$ contains all Hermitian matrices of $\mathfrak{R}_n$.⁴
It is rather easy to make up individual examples of elements of $\mathfrak{R}'_n$, for suitable
choice of $\mathfrak{R}$, but these two classes (the arbitrary matrices over a commutative
ring and the Hermitian matrices over an involutorial ring) are the only extensive
classes of elements of $\mathfrak{R}'_n$ which have been found, with the exception of such as
can be obtained from these by the simple use of direct sums.

We may now state

THEOREM 1. (i) If $A \in \mathfrak{R}'_n$ and $c \in \mathfrak{C}$, then $cA \in \mathfrak{R}'_n$ and $|cA| = c^n|A|$;
(ii) if $cA \in \mathfrak{R}'_n$, $c \in \mathfrak{C}$ and $c$ is not a divisor of zero in $\mathfrak{R}$, then $A \in \mathfrak{R}'_n$; (iii) if
$A \in \mathfrak{R}'_n$, then $\lambda + A \in \mathfrak{R}[\lambda]'_n$, where $\lambda$ is an indeterminate; (iv) if $A \in \mathfrak{R}'_n$, then
any principal minor of $A$ of order $s$ is an element of $\mathfrak{R}'_s$.


These statements are almost obvious, except possibly for (ii). If $cA \in \mathfrak{R}'_n$,
$c \in \mathfrak{C}$, then

$$s_{\sigma,f}(cA) = s_{\sigma,g}(cA) = c^{r+1}s_{\sigma,f}(A) = c^{r+1}s_{\sigma,g}(A),$$

where $r + 1$ is the number of elements in $\sigma$. From this it follows, under the
assumption that $c$ is not a divisor of zero in $\mathfrak{R}$, that

$$s_{\sigma,f}(A) = s_{\sigma,g}(A) = s_\sigma(A).$$

Furthermore, since $c^{r+1}s_\sigma(A) \in \mathfrak{C}$, it follows that for any element $x$ in $\mathfrak{R}$,

$$xc^{r+1}s_\sigma(A) = c^{r+1}s_\sigma(A)x,$$

or

$$c^{r+1}[xs_\sigma(A) - s_\sigma(A)x] = 0.$$

Hence

$$xs_\sigma(A) = s_\sigma(A)x,$$

and thus $s_\sigma(A) \in \mathfrak{C}$. Thus $A$ satisfies both properties I and II and is therefore
an element of $\mathfrak{R}'_n$.

Now a study of Moore's work ([6], pp. 116-124) reveals the fact that, although
the taking of conjugates is frequently suggested as a simple way to obtain one
part of a theorem from another, this process can be easily avoided throughout
and all the theorems there established are true for elements of $\mathfrak{R}'_n$. The only
place where the nature of the ring, or the Hermitian character of the matrices,
plays any essential part is in the proof of Lemma 16.2, which is precisely what
we have assumed in properties I and II. However, before being able to
establish for elements of $\mathfrak{R}'_n$ the theorems concerning the determinant of adj $A$ to be
found on p. 125 of [6], it is necessary to show that if $A$ is an element of $\mathfrak{R}'_n$,
so is adj $A$. This result we shall establish in the next section, which is the only
place in the paper we shall need to make detailed use of Moore's notation.


3. The main theorem about adjoints. Let $\sigma$ denote a set of distinct indices
in the range $1, 2, \cdots, n$. By $-\sigma$ we mean the elements of the set $1, 2, \cdots, n$
which are not in $\sigma$. If $f$ and $g$ are in $\sigma$, we denote the set of elements of $\sigma$ other
than $f$ and $g$ by $\sigma - (f, g)$.

⁴ This follows from Moore-Barnard [6], p. 114 and Jacobson [2].

If $A \in \mathfrak{R}'_n$, by $A^\sigma$ we shall indicate the matrix obtained from $A$ by striking
out all rows and columns in $-\sigma$. Thus $A^\sigma$ is a principal minor of $A$, and its
determinant may be denoted by $a^\sigma$. If $\sigma = (1, 2, \cdots, n)$, the symbol $\sigma$ will
be omitted, thus $|A| = a$. We define $a^{()} = 1$. We may now define adj $A^\sigma =
(b^\sigma_{fg})$ as follows:⁵

$$b^\sigma_{ff} = a^{\sigma-(f)} \qquad (f \text{ in } \sigma),$$

$$b^\sigma_{fg} = \sum_{s=0}^{n_\sigma-2} \sum (-1)^{s+1} a_{fh_1}a_{h_1h_2} \cdots a_{h_{s-1}h_s}a_{h_sg}\, a^{\sigma-(f,g,h_1,\cdots,h_s)},$$

where $f \ne g$ and $f$ and $g$ are elements of $\sigma$. Here $n_\sigma$ is the number of elements
in $\sigma$ and the second sum is taken over all permutations $h_1, \cdots, h_s$ of each
distinct combination of $s$ elements of $\sigma - (f, g)$. It is understood that, if $s = 0$,
the product means simply $a_{fg}a^{\sigma-(f,g)}$. This definition is given here merely for the sake of
completeness as we shall not have occasion to make use of it, but shall rather
use one of Moore's theorems based on it, which is seen to be true for elements
of $\mathfrak{R}'_n$ by the remarks at the end of the preceding section. This theorem will
be stated below as Lemma 1. The main purpose of the present section is to
prove
THEOREM 2. If $A \in \mathfrak{R}'_n$, then adj $A \in \mathfrak{R}'_n$.

The proof will be based on several lemmas which we shall presently establish
but first we give a short outline of the proof. Let $m$ be an arbitrary positive
integer not exceeding $n - 1$; let $i_1, i_2, \cdots, i_m$ be fixed distinct integers from
the set $1, 2, \cdots, n$; and let $f$, $g$ be also from this set but distinct from $i_1$,
$i_2, \cdots, i_m$. We do not assume that $f$ and $g$ are necessarily distinct. Let us set

(1) $S(i_1, i_2, \cdots, i_m; f, g) = \sum b_{fh_1}b_{h_1h_2} \cdots b_{h_{m-1}h_m}b_{h_mg},$

summed on all permutations $h_1, h_2, \cdots, h_m$ of $i_1, i_2, \cdots, i_m$. In view of
the definition of $\mathfrak{R}'_n$ by means of properties I and II, we only need to show that
if $A \in \mathfrak{R}'_n$, then $S(i_1, i_2, \cdots, i_m; f, f)$ is symmetric in $i_1, \cdots, i_m, f$ and is
also an element of $\mathfrak{C}$. We shall obtain, by induction on $m$, a formula for
$S(i_1, \cdots, i_m; f, g)$ and then show that if $f$ and $g$ are identical, this expression
belongs to $\mathfrak{C}$ and is unchanged under any permutation of $i_1, \cdots, i_m, f$.

As indicated above, our first lemma is a restatement of Theorem 16.7 of [6]
as follows:

LEMMA 1. If $A \in \mathfrak{R}'_n$, $\sigma$ is a set of distinct indices in the range $1, 2, \cdots, n$
and $f$, $g$ are in $\sigma$, then

$$a^\sigma b_{fg} = a\,b^\sigma_{fg} + \sum_h^{-\sigma} b_{fh}\,b^{\sigma h}_{hg},$$

it being understood that this sum is extended over all elements $h$ of $-\sigma$.

⁵ This is the definition to be found in [6], p. 119, with a slight change in notation to avoid
introducing part of the notation used there.



















ALGEBRAIC PROPERTIES OF CERTAIN MATRICES OVER A RING 327 


As an illustration of the application of this lemma, suppose $\sigma = -(i_1)$. Then,
in the notation introduced above, we find

$$S(i_1; f, g) = a^{-(i_1)} b_{fg} - a\, b^{-(i_1)}_{fg}, \tag{2}$$

which becomes, for $g = f$,

$$S(i_1; f, f) = a^{-(i_1)} a^{-(f)} - a\, a^{-(i_1, f)}.$$

This is clearly in $\mathfrak{C}$ as each term on the right is a product of determinants of
principal minors of $A$. Also, since these determinants are in $\mathfrak{C}$, it is clear that
$S(i_1; f, f)$ is unchanged by the interchange of $i_1$ and $f$. It is this kind of
calculation that we need to generalize, and the main part of the work consists in
generalizing formula (2). First we need some additional notation.

Let $m$ be a fixed positive integer not exceeding $n - 1$ and $i_1, i_2, \dots, i_m$
a fixed set of distinct integers from the set $1, 2, \dots, n$. Let $f$ and $g$, not
necessarily distinct, be integers from the set $1, 2, \dots, n$ but different from
$i_1, i_2, \dots, i_m$. Let $k_1, \dots, k_r$ be positive integers, not necessarily distinct, such
that $k_1 + k_2 + \cdots + k_r \le m$, and set $m - (k_1 + \cdots + k_r) = k \ge 0$. The
case in which $r = 0$ is not excluded, and we mean in this case that $k = m$ and
there are no elements in the set $k_1, \dots, k_r$. Let us set

$$\sigma_1 = (i_1, \dots, i_{k_1}), \quad \sigma_2 = (i_{k_1+1}, \dots, i_{k_1+k_2}), \quad \dots,$$
$$\sigma_r = (i_{k_1+\cdots+k_{r-1}+1}, \dots, i_{k_1+\cdots+k_r}), \quad \sigma_{r+1} = (i_{k_1+\cdots+k_r+1}, \dots, i_m).$$

Thus there are $k_1$ elements in $\sigma_1$, ..., $k_r$ elements in $\sigma_r$ and $k$ elements in $\sigma_{r+1}$.
We now introduce the notation

$$[k_1, k_2, \dots, k_r; k] = \sum a^{m-r}\, a^{-\sigma_1} a^{-\sigma_2} \cdots a^{-\sigma_r}\, b^{-\sigma_{r+1}}_{fg}, \tag{3}$$

where the sum is to be taken over all permutations of $i_1, i_2, \dots, i_m$ which take
the expression

$$x_{i_1} x_{i_2} \cdots x_{i_{k_1}} + x_{i_{k_1+1}} \cdots x_{i_{k_1+k_2}} + \cdots + x_{i_{k_1+\cdots+k_{r-1}+1}} \cdots x_{i_{k_1+\cdots+k_r}} + y_{i_{k_1+\cdots+k_r+1}} \cdots y_{i_m} \tag{4}$$

into distinct such expressions, it being understood that $x_{i_1}, \dots, x_{i_m}, y_{i_1}, \dots, y_{i_m}$
are distinct commutative indeterminates. We may remark that (4) is
unchanged under permutation of the elements in any single product and also under
interchange of two products of $x$'s which have the same number of letters.
Also each factor of every term on the right side of (3), with the exception of
factors of the form $b^{-\sigma_{r+1}}_{fg}$, is in $\mathfrak{C}$ and can therefore be arranged at pleasure.
Hence $[k_1, \dots, k_r; k]$ is unchanged under any permutation of $k_1, k_2, \dots, k_r$.

The main part of the proof of Theorem 2 is contained in the proof of

LEMMA 2. We have

$$S(i_1, \dots, i_m; f, g) = \sum r!\,(-1)^{m-r}\, [k_1, k_2, \dots, k_r; k], \tag{5}$$

summed over all different sets$^6$ $k_1, \dots, k_r$ of positive integers, repetitions being


6 Two sets are different if the elements of one can not be obtained by permuting the ele- 
ments of the other. 








328 NEAL H. McCOY 


allowed, such that $k_1 + \cdots + k_r \le m$. As above, $k = m - (k_1 + \cdots + k_r)$,
$r \ge 0$.
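As a worked instance, offered only as a sketch read off from (3) and (5), the case $m = 2$ can be written out. The different sets are the empty set ($r = 0$, $k = 2$), the set $(1)$ with $k = 1$, the set $(2)$ with $k = 0$, and the set $(1, 1)$ with $k = 0$, so that (5) would read:

```latex
\begin{align*}
S(i_1, i_2; f, g) ={}& a^{2}\, b^{-(i_1, i_2)}_{fg}
  - a \left( a^{-(i_1)}\, b^{-(i_2)}_{fg} + a^{-(i_2)}\, b^{-(i_1)}_{fg} \right) \\
& - a\, a^{-(i_1, i_2)}\, b_{fg}
  + 2\, a^{-(i_1)} a^{-(i_2)}\, b_{fg} .
\end{align*}
```

The four groups are $[\,;2]$, $-[1;1]$, $-[2;0]$ and $2\,[1,1;0]$. The symbol $[1;1]$ contributes two terms because the two assignments of $i_1$, $i_2$ give distinct expressions (4), while $[1,1;0]$ contributes a single term with the coefficient $2! = 2$.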


The proof is by induction on $m$. If $m = 1$, (5) reduces to (2), which is true
by Lemma 1. Accordingly, we assume (5) for a fixed $m < n - 1$ and shall
prove it for $m + 1$. An examination of (1) shows that we can pass from
$S(i_1, \dots, i_m; f, g)$ to $S(i_1, \dots, i_{m+1}; f, g)$ by the following induction operation.
Replace $g$ by $i_{m+1}$, multiply by $b_{i_{m+1} g}$ on the right and then in the resulting
expression perform in turn the transpositions $(i_{m+1}, i_1), \dots, (i_{m+1}, i_m)$ and add.
We need to investigate closely the result of this operation on the right side of (5).
To this end, we consider first a single term $[k_1, \dots, k_r; k]$ occurring on the
right side of (5) and shall prove

LEMMA 3. By the above defined induction operation, $[k_1, \dots, k_r; k]$ goes over
into

$$-[k_1, \dots, k_r; k + 1] + \alpha\, [k_1, \dots, k_r, k + 1; 0],$$

where these symbols are defined as in (3) with $m$ replaced by $m + 1$ and $\alpha$ is one
more than the number of the integers $k_1, \dots, k_r$ which are equal to $k + 1$.

If we perform the induction operation on $[k_1, \dots, k_r; k]$, we get

$$a^{m-r} \sum \sum a^{-\sigma_1} \cdots a^{-\sigma_r}\, b^{-\sigma_{r+1}}_{f i_{m+1}}\, b_{i_{m+1} g}, \tag{6}$$

where the inner sum is precisely the sum appearing in (3) while the outer one
means a sum over the successive transpositions $(i_{m+1}, i_1), \dots, (i_{m+1}, i_m)$,
$(i_{m+1}, i_{m+1})$. Evidently (6) is symmetric in $i_1, \dots, i_{m+1}$. Let us find the
coefficient of $a^{m-r} a^{-\sigma_1} \cdots a^{-\sigma_r}$ in (6). A similar result will follow after any
permutation of $i_1, i_2, \dots, i_{m+1}$. Clearly,

$$a^{m-r} a^{-\sigma_1} \cdots a^{-\sigma_r} \sum b^{-\sigma_{r+1}}_{f i_{m+1}}\, b_{i_{m+1} g}, \tag{7}$$

where this sum is taken over the successive interchanges of $i_{m+1}$ with elements
of $\sigma_{r+1}$, appears in (6). But, by Lemma 1 with $\sigma = -(\sigma_{r+1}, i_{m+1})$, the
expression (7) is equal to

$$a^{m-r} a^{-\sigma_1} \cdots a^{-\sigma_r} \left[ a^{-(\sigma_{r+1}, i_{m+1})}\, b_{fg} - a\, b^{-(\sigma_{r+1}, i_{m+1})}_{fg} \right]. \tag{8}$$

Thus (8) is a sum of two terms, one from $[k_1, \dots, k_r, k + 1; 0]$ and the other
from $-[k_1, \dots, k_r; k + 1]$. It will be observed that we can obtain in this
way from (6) each term of $-[k_1, \dots, k_r; k + 1]$ once and only once. However,
each term of $[k_1, \dots, k_r, k + 1; 0]$ is obtained once more than the number
of times $k + 1$ appears among the set $k_1, k_2, \dots, k_r$. For example, if $k_1 =
k + 1$, there will be another expression similar to (8) with $\sigma_1$ interchanged with
$(\sigma_{r+1}, i_{m+1})$. The first term of this expression will actually be equal to the first
term of (8), but the second term will be a term of $-[k_1, \dots, k_r; k + 1]$ different
from the one appearing in (8). These considerations establish Lemma 3 and
we proceed to complete the proof of Lemma 2.







In view of our hypothesis that (5) is true, and Lemma 3, we have

$$S(i_1, \dots, i_{m+1}; f, g) = \sum r!\,(-1)^{m-r} \left\{ \alpha\, [k_1, \dots, k_r, k + 1; 0] - [k_1, \dots, k_r; k + 1] \right\}, \tag{9}$$

the sum being precisely the sum in (5). We wish to prove

$$S(i_1, \dots, i_{m+1}; f, g) = \sum s!\,(-1)^{m+1-s}\, [l_1, \dots, l_s; l], \tag{10}$$

summed over all sets $l_1, l_2, \dots, l_s$ of positive integers such that $l_1 + l_2 +
\cdots + l_s \le m + 1$, where $s \ge 0$ and $l = m + 1 - (l_1 + \cdots + l_s)$. To this
end we choose fixed integers $l_1, l_2, \dots, l_s, l$ satisfying these conditions and seek
where $[l_1, \dots, l_s; l]$ occurs in (9). If $l \neq 0$, it appears once and only once,
namely, in the term with $k_i = l_i$ $(i = 1, 2, \dots, s)$, $k = l - 1$, and the
coefficient is $s!\,(-1)^{m+1-s}$ as required in (10). Suppose now that $l = 0$, and let us
assume that $l_{t_1}, \dots, l_{t_j}$ are the distinct integers among $l_1, l_2, \dots, l_s$ and that
$l_{t_1}$ appears $p_1$ times, $l_{t_2}$ appears $p_2$ times, and so on. Thus $p_1 + \cdots + p_j = s$.
Then $[l_1, \dots, l_s; 0]$ appears in (9) with $k_i = l_i$ $(i = 2, 3, \dots, s)$, $k = l_1 - 1$,
with a coefficient $p_1 (s - 1)!\,(-1)^{m+1-s}$. Also this expression appears in (9) when
$k_i = l_i$ $(i \neq t_2)$, $k = l_{t_2} - 1$, with a coefficient $p_2 (s - 1)!\,(-1)^{m+1-s}$, and so on.
Adding all these coefficients, we see that $[l_1, \dots, l_s; 0]$ appears in (9) with a
coefficient $s!\,(-1)^{m+1-s}$, and this is the desired coefficient. The proof is
therefore completed.

We may now complete the proof of Theorem 2. If, in formula (5), we replace
$g$ by $f$ and expand each expression of the type $[k_1, \dots, k_r; k]$ into a sum of
terms as given in (3), we see that, since by definition $b^{-\sigma_{r+1}}_{ff} = a^{-(\sigma_{r+1}, f)}$, every
term is a product of determinants of principal minors of $A$. Since these are all
in $\mathfrak{C}$, it follows that $S(i_1, \dots, i_m; f, f) \in \mathfrak{C}$. There remains only to show that
$S(i_1, \dots, i_m; f, f)$ is symmetric in $i_1, \dots, i_m, f$. It is clearly symmetric in
$i_1, \dots, i_m$ and we therefore only need to show that it is unchanged under the
interchange of $f$ and $i_1$. Consider a single term in the expansion of the right
side of (5) with $g = f$. If, for example, $m = 5$ and $i_1$ appears with $f$ as in

$$a^{3}\, a^{-(i_2, i_3)}\, a^{-(i_4)}\, a^{-(i_1, i_5, f)},$$

this term is clearly unchanged under the interchange of $i_1$ and $f$. Corresponding
to any term in which $i_1$ and $f$ do not appear together, say

$$a^{3}\, a^{-(i_1, i_2)}\, a^{-(i_3)}\, a^{-(i_4, i_5, f)},$$

which comes from $[1, 2; 2]$, there is another term

$$a^{3}\, a^{-(f, i_2)}\, a^{-(i_3)}\, a^{-(i_4, i_5, i_1)}$$

with $i_1$ and $f$ interchanged. This term comes from $[1, 3; 1]$ and has the same
coefficient in (5) as the preceding term. Their sum is obviously unchanged
under the interchange of $i_1$ and $f$. In general, one can thus pair all terms
occurring in the right of (5) in which $i_1$ and $f$ do not appear together in such a
way that the interchange of $i_1$ and $f$ leaves the sum of each pair unchanged.
This completes the proof of Theorem 2.

We may remark that, in the notation of §2, our proof really shows that, if 
A eM, , then s, (adj A) is expressible as a polynomial, with integral coefficients, 
in the determinants of the principal minors of A. In view of our definition of 
determinant, this means finally that s, (adj A) is expressible as a polynomial, 
with integral coefficients, in the different s,-(A). 


4. Further properties of adj A. In view of Theorem 2, it may now be verified
that the theorems in [6], pp. 125-127, having to do with the determinant of
adj A are valid for elements of $\mathfrak{R}'_n$. We shall explicitly mention only two
results which will be used in the sequel, and which are well known for matrices
over a commutative ring. These are as follows, where it is assumed that
$A \in \mathfrak{R}'_n$:

$$A \operatorname{adj} A = (\operatorname{adj} A) A = |A|, \tag{11}$$

and

$$|\operatorname{adj} A| = |A|^{n-1}. \tag{12}$$

As a matter of fact, the first of these can be established without use of Theorem 2,
while the second naturally requires the theorem.

It is now easy to prove

THEOREM 3. If $A \in \mathfrak{R}'_n$ and $A$ is a divisor of zero in $\mathfrak{R}_n$, then $|A|$ is a divisor
of zero in $\mathfrak{R}$.

For if $AX = 0$, $X \neq 0$, multiplication by adj A on the left yields at once
$|A| X = 0$.

If now $A \in \mathfrak{R}'_n$ and $\lambda$ is an indeterminate, we shall frequently make use of the
characteristic polynomial of $A$,

$$f(\lambda) = |\lambda - A| = \lambda^{n} + a_1 \lambda^{n-1} + \cdots + a_n. \tag{13}$$

It is to be noted that the coefficients in $f(\lambda)$ are from $\mathfrak{C}$ and, in fact, $a_i$ is, except
possibly for sign, the sum of the determinants of the principal minors of $A$ of
order $i$. It is easy to show that $f(A) = 0$, which will be a special instance of a
more general theorem to be established below, and we shall for the present
assume this result. We shall now prove

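THEOREM 4. If $A \in \mathfrak{R}'_n$ and the characteristic polynomial of $A$ is given by
formula (13), then

$$\operatorname{adj} A = (-1)^{n-1} \left( A^{n-1} + a_1 A^{n-2} + \cdots + a_{n-1} \right). \tag{14}$$

Adjoin an indeterminate $\mu$ to $\mathfrak{R}$, getting the ring $\mathfrak{R}[\mu]$. From (11), applied
to $\mu + A$ which, by Theorem 1, is in $\mathfrak{R}[\mu]'_n$, we get

$$(\mu + A) \operatorname{adj} (\mu + A) = |\mu + A|. \tag{15}$$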










Now $\mu + A$ satisfies its characteristic equation, say

$$(\mu + A)^{n} + b_1 (\mu + A)^{n-1} + \cdots + b_n = 0,$$

where $b_n = (-1)^{n} |\mu + A|$. This may be rewritten in the form

$$(\mu + A) \left[ (\mu + A)^{n-1} + \cdots + b_{n-1} \right] = (-1)^{n-1} |\mu + A|. \tag{16}$$

From (15) and (16) it therefore follows that

$$(\mu + A) \left\{ (-1)^{n-1} \operatorname{adj} (\mu + A) - \left[ (\mu + A)^{n-1} + \cdots + b_{n-1} \right] \right\} = 0.$$

Now $|\mu + A| = \mu^{n} + \cdots$ is clearly not a divisor of zero in $\mathfrak{R}[\mu]$, and thus,
by Theorem 3, $\mu + A$ is not a divisor of zero in $\mathfrak{R}[\mu]_n$. Hence

$$\operatorname{adj} (\mu + A) = (-1)^{n-1} \left[ (\mu + A)^{n-1} + \cdots + b_{n-1} \right], \tag{17}$$

and the desired result follows by equating the terms on both sides of (17) which
are independent of $\mu$ or, what amounts to the same thing, formally replacing
$\mu$ by 0.
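Formulas (11), (12) and (14) lend themselves to a quick numerical check. The sketch below assumes the commutative special case of integer matrices and computes each $a_i$ as $(-1)^i$ times the sum of the principal minors of order $i$, which fixes the sign left open in the remark after (13); all function names are illustrative only.

```python
from itertools import combinations, permutations


def det(m):
    """Determinant by the Leibniz expansion; the empty matrix has determinant 1."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for x in range(n) for y in range(x + 1, n) if perm[x] > perm[y]
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def matmul(x, y):
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def adjugate(m):
    """Classical adjoint computed from cofactors."""
    n = len(m)
    adj = [[0] * n for _ in range(n)]
    for f in range(n):
        for g in range(n):
            rows = [r for r in range(n) if r != g]
            cols = [c for c in range(n) if c != f]
            adj[f][g] = (-1) ** (f + g) * det([[m[r][c] for c in cols] for r in rows])
    return adj


def char_coeffs(m):
    """[a_1, ..., a_n] with |lambda - A| = lambda^n + a_1 lambda^(n-1) + ... + a_n."""
    n = len(m)
    coeffs = []
    for order in range(1, n + 1):
        minors = sum(
            det([[m[r][c] for c in idx] for r in idx])
            for idx in combinations(range(n), order)
        )
        coeffs.append((-1) ** order * minors)
    return coeffs


def adjugate_by_formula_14(m):
    """(-1)^(n-1) (A^(n-1) + a_1 A^(n-2) + ... + a_(n-1)), evaluated by Horner's rule."""
    n = len(m)
    a = char_coeffs(m)
    poly = [[int(r == c) for c in range(n)] for r in range(n)]
    for i in range(1, n):
        poly = matmul(poly, m)
        for d in range(n):
            poly[d][d] += a[i - 1]
    sign = (-1) ** (n - 1)
    return [[sign * v for v in row] for row in poly]


A = [[2, -1, 3, 0], [1, 4, -2, 5], [0, 3, 1, -1], [7, 2, 0, 6]]
n = len(A)
scalar = [[det(A) * int(r == c) for c in range(n)] for r in range(n)]
print(adjugate_by_formula_14(A) == adjugate(A))      # formula (14)
print(matmul(A, adjugate(A)) == scalar)              # formula (11)
print(det(adjugate(A)) == det(A) ** (n - 1))         # formula (12)
```

Such a check says nothing about the non-commutative case treated in the text; it only confirms that the signs and indices in (14) agree with the familiar commutative statement.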
From relation (11) and Theorem 2, it is obvious that if $A \in \mathfrak{R}'_n$ and $|A|$ has
an inverse in $\mathfrak{R}$ (actually in $\mathfrak{C}$), then $A$ has an inverse in $\mathfrak{R}'_n$, namely
$|A|^{-1} \operatorname{adj} A$. We may also prove the following partial converse.

THEOREM 5. If $A \in \mathfrak{R}'_n$ has an inverse in $\mathfrak{R}_n$ and $|A|$ is not a divisor of zero
in $\mathfrak{R}$, then $|A|$ has an inverse in $\mathfrak{R}$.

If $AX = 1$, and we multiply on the left by adj A, we get

$$|A| X = \operatorname{adj} A. \tag{18}$$

Thus, by Theorem 2, $|A| X \in \mathfrak{R}'_n$ and, since $|A|$ is not a divisor of zero in $\mathfrak{R}$,
this implies by (ii) of Theorem 1 that $X \in \mathfrak{R}'_n$. Taking determinants of both
sides of (18), we have

$$|A|^{n}\, |X| = |A|^{n-1}.$$

This, however, implies that $|A| \cdot |X| = 1$, and the theorem is established.

5. Minimum ideal and related topics. We now pass to a generalization of the
familiar notion of minimum function of a matrix with coefficients in a field.
If $A \in \mathfrak{R}_n$ and $g(\lambda) = a_0 + a_1 \lambda + \cdots + a_m \lambda^{m}$ is an element of $\mathfrak{R}[\lambda]$, we define

$$g_r(A) = a_0 + a_1 A + \cdots + a_m A^{m}$$

and

$$g_l(A) = a_0 + A a_1 + \cdots + A^{m} a_m.$$

It is easy to see that the set of all elements $g(\lambda)$ of $\mathfrak{R}[\lambda]$ such that $g_r(A) = 0$
is a left ideal $\mathfrak{m}_r$ in $\mathfrak{R}[\lambda]$. Similarly, we may denote by $\mathfrak{m}_l$ the right ideal in
$\mathfrak{R}[\lambda]$ of all those elements $g(\lambda)$ such that $g_l(A) = 0$. The totality of all elements
$h(\lambda)$ of $\mathfrak{C}[\lambda]$ such that $h(A) = 0$ is an ideal $\mathfrak{m}$ in $\mathfrak{C}[\lambda]$, which we shall henceforth
call the minimum ideal of $A$ since it plays a rôle similar to the ordinary minimum
function of a matrix if the elements are from a field. Clearly $\mathfrak{m} = \mathfrak{m}_r \cap \mathfrak{C}[\lambda] =
\mathfrak{m}_l \cap \mathfrak{C}[\lambda]$. It is also clear that $\mathfrak{C}[A] \cong \mathfrak{C}[\lambda]/\mathfrak{m}$, so that the ring $\mathfrak{C}[A]$ is
determined when the ideal $\mathfrak{m}$ is characterized. In general, the determination of $\mathfrak{m}_r$,
$\mathfrak{m}_l$ and $\mathfrak{m}$ may be quite difficult, but if $A \in \mathfrak{R}'_n$ we shall now show how these
ideals may be determined.

Let $A$ be a fixed element of $\mathfrak{R}'_n$, with characteristic polynomial $f(\lambda) =
|\lambda - A|$, and let us set $\operatorname{adj} (\lambda - A) = (h_{ij}(\lambda))$. We now prove$^7$

THEOREM 6. If $A \in \mathfrak{R}'_n$, then $g_r(A) = 0$ if and only if

$$g(\lambda)\, h_{ij}(\lambda) \equiv 0 \ (f(\lambda)) \qquad (i, j = 1, 2, \dots, n). \tag{19}$$

Similarly, $g_l(A) = 0$ if and only if

$$h_{ij}(\lambda)\, g(\lambda) \equiv 0 \ (f(\lambda)) \qquad (i, j = 1, 2, \dots, n). \tag{20}$$

We shall prove the first part of the theorem. If we assume relation (19),
we have

$$g(\lambda) \operatorname{adj} (\lambda - A) = B f(\lambda),$$

where $B \in \mathfrak{R}[\lambda]_n$. Multiply this equation by $\lambda - A$ on the right, thus getting

$$g(\lambda) f(\lambda) = B f(\lambda) (\lambda - A) = B (\lambda - A) f(\lambda).$$

Now $f(\lambda)$ has leading coefficient 1 and is therefore not a divisor of zero. Hence

$$g(\lambda) = B (\lambda - A), \tag{21}$$

and the factor theorem (cf. [1], p. 26) shows at once that $g_r(A) = 0$.

Conversely, if $g_r(A) = 0$, the factor theorem states the existence of a relation
(21) and multiplication by $\operatorname{adj} (\lambda - A)$ on the right yields

$$g(\lambda) \operatorname{adj} (\lambda - A) = B f(\lambda), \tag{22}$$

which is equivalent to (19). The second part of the theorem can be established
by a similar argument.

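We may easily establish the following

COROLLARY. If $A \in \mathfrak{R}'_n$ and $g(\lambda) \equiv 0 \ (\mathfrak{m})$, then

$$[g(\lambda)]^{n} \equiv 0 \ (f(\lambda)),$$

and thus $\mathfrak{m}$ and $f(\lambda)$ have the same prime ideal divisors in $\mathfrak{C}[\lambda]$.

If $g(\lambda) \equiv 0 \ (\mathfrak{m})$, then by Theorem 2 the left side of equation (22) is in $\mathfrak{R}[\lambda]'_n$,
and the same is therefore true of the right side. Since $f(\lambda)$ is not a divisor of
zero in $\mathfrak{R}[\lambda]$, this implies, by (ii) of Theorem 1, that $B \in \mathfrak{R}[\lambda]'_n$. We may
therefore take determinants of both sides of (22), getting

$$[g(\lambda)]^{n}\, [f(\lambda)]^{n-1} = |B|\, [f(\lambda)]^{n},$$

from which it follows that

$$[g(\lambda)]^{n} = |B|\, f(\lambda).$$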
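A small worked instance may make the corollary concrete. It is a sketch under the assumption that $\mathfrak{R}$ is commutative (so that $\mathfrak{C} = \mathfrak{R}$) and that $A$ is the unit matrix of order $n$; neither assumption comes from the text above.

```latex
\begin{align*}
f(\lambda) &= |\lambda - 1| = (\lambda - 1)^{n},
  & \mathfrak{m} &= (\lambda - 1), \\
\operatorname{adj}(\lambda - 1) &= (\lambda - 1)^{n-1} \cdot 1,
  & g(\lambda) &= \lambda - 1 \equiv 0 \ (\mathfrak{m}), \\
g(\lambda) \operatorname{adj}(\lambda - 1) &= (\lambda - 1)^{n} \cdot 1 = B f(\lambda)
  \ \text{ with } B = 1,
  & [g(\lambda)]^{n} &= |B|\, f(\lambda) = (\lambda - 1)^{n}.
\end{align*}
```

Here $\mathfrak{m}$ is strictly larger than $(f(\lambda))$ when $n > 1$, yet the $n$-th power of its generator lies in $(f(\lambda))$, as the corollary asserts.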


7 This is essentially the proof for the commutative case to be found in [5] and it is included 
here for the sake of completeness. 







6. Generalization of Ostrowski’s theorem. Recently, Ostrowski [7] has 
extended a theorem due to Phillips [8] and thus obtained for the case of several 
matrices with coefficients in a field an analogue of the notion of minimum 
function. We shall now show how this result can be extended to the general 
case under discussion in this paper. The case of an arbitrary commutative ring 
has already been treated in [4], and we shall present in detail only those parts 
which have to be essentially modified in the non-commutative case. 


Let $x_1, \dots, x_m$ be commutative indeterminates and let us denote by $\mathfrak{S}$ the
ring $\mathfrak{R}[x_1, \dots, x_m]$. Throughout this section, we assume that $A_k$ $(k = 1,
2, \dots, m)$ are fixed elements of $\mathfrak{R}_n$ such that the matrix

$$x_1 A_1 + x_2 A_2 + \cdots + x_m A_m \tag{23}$$

is an element of $\mathfrak{S}'_n$. This implies, in particular, that each $A_i$ is in $\mathfrak{R}'_n$. The
above stated condition will certainly be satisfied if $\mathfrak{R}$ is a commutative ring and
the $A_i$ are arbitrary elements of $\mathfrak{R}_n$, or if $\mathfrak{R}$ is an involutorial ring and the $A_i$ are
Hermitian.

Let us set

$$|x_1 A_1 + \cdots + x_m A_m| = F(x_1, \dots, x_m),$$

which is an element in the center of $\mathfrak{S}$. Further, let

$$\operatorname{adj} (x_1 A_1 + \cdots + x_m A_m) = (F_{ij}(x_1, \dots, x_m)).$$

Denote by $\mathfrak{n}_r$ the left ideal in $\mathfrak{S}$ of those elements $f(x_1, \dots, x_m)$ such that

$$f(x_1, \dots, x_m)\, F_{ij}(x_1, \dots, x_m) \equiv 0 \ (F(x_1, \dots, x_m)) \qquad (i, j = 1, 2, \dots, n). \tag{24}$$

Similarly, let $\mathfrak{n}_l$ be the right ideal consisting of those elements $f(x_1, \dots, x_m)$
of $\mathfrak{S}$ such that

$$F_{ij}(x_1, \dots, x_m)\, f(x_1, \dots, x_m) \equiv 0 \ (F(x_1, \dots, x_m)) \qquad (i, j = 1, 2, \dots, n). \tag{25}$$


We may now prove

THEOREM 7. Let $A_k$ $(k = 1, 2, \dots, m)$ be elements of $\mathfrak{R}_n$ such that the matrix
(23) is an element of $\mathfrak{S}'_n$, and such that $F(x_1, \dots, x_m)$ is not a divisor of zero in $\mathfrak{S}$.
Let $B_1, B_2, \dots, B_m$ be elements of $\mathfrak{R}_n$, commutative in pairs, such that

$$A_1 B_1 + A_2 B_2 + \cdots + A_m B_m = 0. \tag{26}$$

Then if $f(x_1, \dots, x_m) \equiv 0 \ (\mathfrak{n}_r)$, it follows that $f_r(B_1, \dots, B_m) = 0$.$^8$ Similarly, if

$$B_1 A_1 + B_2 A_2 + \cdots + B_m A_m = 0 \tag{27}$$

and $f(x_1, \dots, x_m) \equiv 0 \ (\mathfrak{n}_l)$, then $f_l(B_1, \dots, B_m) = 0$.

8 By this notation, we mean that in $f(x_1, \dots, x_m)$ we replace $x_i$ by $B_i$ $(i = 1, 2, \dots, m)$
and write the power products of the $B$'s on the right in each term.

We may remark first that although it is not a necessary condition,
$F(x_1, \dots, x_m)$ will not be a divisor of zero in $\mathfrak{S}$ if, for any $k$, $|A_k|$ is not a
divisor of zero in $\mathfrak{R}$. This is evident since the term in $F(x_1, \dots, x_m)$ of highest
degree in $x_k$ is precisely $|A_k|\, x_k^{n}$.

We shall prove the first part of the theorem only, as the second part can be
proved similarly. The proof is an adaptation of the proof given in [4] for the
commutative case.

Let us set $A_k = (a^{(k)}_{ij})$. By hypothesis that $f(x_1, \dots, x_m) \equiv 0 \ (\mathfrak{n}_r)$, it follows
that

$$f(x_1, \dots, x_m)\, F_{ij}(x_1, \dots, x_m) = h_{ij}(x_1, \dots, x_m)\, F(x_1, \dots, x_m) \qquad (i, j = 1, 2, \dots, n), \tag{28}$$

where the $h_{ij}(x_1, \dots, x_m)$ are elements of $\mathfrak{S}$. Furthermore, by formula (11)
we have

$$\sum_{l=1}^{n} F_{il}(x_1, \dots, x_m) \sum_{k=1}^{m} a^{(k)}_{lj} x_k = \delta_{ij}\, F(x_1, \dots, x_m).$$

If we multiply this by $f(x_1, \dots, x_m)$ on the left, make use of equation (28)
and cancel $F(x_1, \dots, x_m)$ from both sides since it is not a divisor of zero in $\mathfrak{S}$,
we get

$$\sum_{l=1}^{n} h_{il}(x_1, \dots, x_m) \sum_{k=1}^{m} a^{(k)}_{lj} x_k = \delta_{ij}\, f(x_1, \dots, x_m). \tag{29}$$

Now let us set

$$C_{lj} = \sum_{k=1}^{m} a^{(k)}_{lj} B_k$$

and

$$h_{il}(x_1, \dots, x_m) = \sum_{\alpha} h^{(\alpha)}_{il}\, x^{(\alpha)}_{il},$$

where $x^{(\alpha)}_{il}$ indicates the different power products of the $x$'s which appear
in $h_{il}(x_1, \dots, x_m)$. This notation merely serves the purpose of enabling
us to separate the coefficients in $h_{il}(x_1, \dots, x_m)$ from the power products of
the $x$'s. Now, in (29), the $x$'s are commutative with everything and if we
write them on the right in every term on both sides of (29), the equation (29)
then says merely that the coefficient of any power product of the $x$'s is the same
on both sides. The equation therefore remains true if we replace $x_k$ by $B_k$
$(k = 1, 2, \dots, m)$, thus getting

$$\sum_{l=1}^{n} \sum_{\alpha} h^{(\alpha)}_{il}\, C_{lj}\, B^{(\alpha)}_{il} = \delta_{ij}\, f_r(B_1, \dots, B_m). \tag{30}$$

It is naturally understood that $B^{(\alpha)}_{il}$ represents the same power product of the
$B$'s that $x^{(\alpha)}_{il}$ is of the $x$'s.

Now if $E_{ij}$ is the matrix with 1 at the intersection of the $i$-th row and $j$-th
column and zeros elsewhere, we may rewrite (26) as follows:

$$0 = \sum_{k=1}^{m} \sum_{i,j=1}^{n} a^{(k)}_{ij} E_{ij} B_k = \sum_{i,j=1}^{n} E_{ij} \sum_{k=1}^{m} a^{(k)}_{ij} B_k = \sum_{i,j=1}^{n} E_{ij} C_{ij}.$$



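Multiplication by $E_{1l}$ on the left yields

$$0 = \sum_{i,j=1}^{n} E_{1l} E_{ij} C_{ij} = \sum_{j=1}^{n} E_{1j} C_{lj}.$$

Now multiply this equation by $h^{(\alpha)}_{il}$ on the left, $B^{(\alpha)}_{il}$ on the right, and sum on $\alpha$
and $l$. The matrix $E_{1j}$ is commutative with $h^{(\alpha)}_{il}$, and we therefore get, by
making use of (30),

$$0 = \sum_{j=1}^{n} E_{1j} \sum_{l, \alpha} h^{(\alpha)}_{il}\, C_{lj}\, B^{(\alpha)}_{il} = \sum_{j=1}^{n} E_{1j}\, \delta_{ij}\, f_r(B_1, \dots, B_m) = E_{1i}\, f_r(B_1, \dots, B_m).$$

If we multiply this by $E_{i1}$ on the left and sum on $i$, it follows that
$f_r(B_1, \dots, B_m) = 0$, which is the desired result.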

It is possible to prove a partial converse of the preceding theorem by making 
only trivial changes in the proof for the commutative case (cf. [4], p. 494). 
Accordingly, we shall merely state this result without proof as 

THEOREM 8. We assume that $A_1 = 1$, and also that $\mathfrak{R}$ has the property that,
if $g(\lambda) \in \mathfrak{R}[\lambda]$ such that $g(a) = 0$ for every element $a$ of $\mathfrak{C}$, then $g(\lambda) = 0$. If
$f(x_1, \dots, x_m)$ is an element of $\mathfrak{S}$ such that $f_r(B_1, \dots, B_m) = 0$ for every choice
of matrices $B_1, \dots, B_m$ which are commutative and satisfy (26), then
$f(x_1, \dots, x_m) \equiv 0 \ (\mathfrak{n}_r)$.

A similar result holds if $r$ and $l$ are interchanged and relation (26) replaced
by (27). We may remark also that considerations similar to those used in the
proof of the corollary to Theorem 6 show at once that, if $f(x_1, \dots, x_m)$ is in
$\mathfrak{C}[x_1, \dots, x_m]$ and is an element of $\mathfrak{n}_r$ (or $\mathfrak{n}_l$), then

$$[f(x_1, \dots, x_m)]^{n} \equiv 0 \ (F(x_1, \dots, x_m)).$$


7. Quaternion rings. Definition of class $\mathcal{K}$. There are a number of theorems
which are known to be true for matrices over a commutative ring and which
we are unable to establish for arbitrary elements of $\mathfrak{R}'_n$. We proceed to restrict
further the class of matrices considered, and to this end we pause to introduce
the concept of a quaternion ring.

Let $\mathfrak{C}$ be a commutative ring with unit element 1, in which the equation
$2x = 1$ has a unique solution $x = \tfrac{1}{2}$. We consider the linear form module of
expressions of the form

$$a = c_0 w_0 + c_1 w_1 + c_2 w_2 + c_3 w_3 \qquad (c_i \text{ in } \mathfrak{C}), \tag{31}$$

where $w_0, w_1, w_2, w_3$ are linearly independent over $\mathfrak{C}$ and are commutative
with elements of $\mathfrak{C}$. We assume for the $w_i$ the following multiplication table:

$$w_0 w_i = w_i w_0 = w_i \ (i = 0, 1, 2, 3), \qquad w_1^2 = p, \quad w_2^2 = q, \quad w_3^2 = -pq,$$

$$w_1 w_2 = -w_2 w_1 = w_3, \qquad w_1 w_3 = -w_3 w_1 = p w_2, \qquad w_2 w_3 = -w_3 w_2 = -q w_1,$$










where $p$ and $q$ are elements of $\mathfrak{C}$. We assume further that if $x$ is an element
of $\mathfrak{C}$ such that $px = qx = 0$, then $x = 0$; in other words, $p$ and $q$ are not
simultaneously annihilated by any non-zero element of $\mathfrak{C}$. Under these hypotheses,
the totality of elements of form (31) is a ring $\mathfrak{R}$ which we call a quaternion ring
over $\mathfrak{C}$ or simply a quaternion ring. Clearly $\mathfrak{R}$ contains a subring of all elements
of form $c_0 w_0$, which is isomorphic to $\mathfrak{C}$ and which we identify with $\mathfrak{C}$. Thus,
henceforth we shall write 1 in place of $w_0$.

It is easy to show that $\mathfrak{C}$ is the center of $\mathfrak{R}$. For if $a$, defined by (31), is in
the center of $\mathfrak{R}$, then $a w_1 = w_1 a$ implies that $c_2 = 0$ and $p c_3 = 0$, while $a w_2 = w_2 a$
implies that $c_1 = 0$ and $q c_3 = 0$. By our assumption concerning $p$ and $q$, it
follows that $c_3 = 0$ and hence $a = c_0 \in \mathfrak{C}$. Clearly elements of $\mathfrak{C}$ are in the
center of $\mathfrak{R}$ and therefore $\mathfrak{C}$ coincides with the center.

We may naturally define conjugates in the usual way. If $a$ is given by (31),
then $a \to \bar{a}$, where

$$\bar{a} = c_0 - c_1 w_1 - c_2 w_2 - c_3 w_3,$$

defines an involution in $\mathfrak{R}$, and further the symmetric elements are precisely the
elements of $\mathfrak{C}$. Thus $\mathfrak{R}$ is an involutorial ring as defined in the introduction.
Obviously, generalized quaternion algebras over a field of characteristic other
than 2 are instances of quaternion rings. We may also remark that the
complete matric ring $\mathfrak{C}_2$ is a quaternion ring over $\mathfrak{C}$ as can be seen by setting $p = 1$,
$q = -1$. Then if $E_{ij}$ denotes the matric units used in the preceding section,
we have $E_{11} = \tfrac{1}{2}(1 + w_1)$, $E_{12} = \tfrac{1}{2}(w_3 + w_2)$, $E_{21} = \tfrac{1}{2}(w_3 - w_2)$, $E_{22} = \tfrac{1}{2}(1 - w_1)$.
Thus if

$$a = \begin{pmatrix} f & g \\ h & i \end{pmatrix},$$

it follows that

$$\bar{a} = \begin{pmatrix} i & -g \\ -h & f \end{pmatrix}.$$

This example of $\mathfrak{C}_2$, as an involutorial ring, for the case in which $\mathfrak{C}$ is a field
not of characteristic 2, has been used by Jacobson in [2].
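The matric example can be checked mechanically. The sketch below takes $\mathfrak{C}$ to be the rationals (an assumption made only so that $\tfrac{1}{2}$ is available) and solves the four relations for the $E_{ij}$ to get $w_1 = E_{11} - E_{22}$, $w_2 = E_{12} - E_{21}$, $w_3 = E_{12} + E_{21}$; the helper names are illustrative.

```python
from fractions import Fraction


def mul(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def scale(c, x):
    return [[c * v for v in row] for row in x]


def add(*ms):
    return [[sum(m[r][c] for m in ms) for c in range(2)] for r in range(2)]


ONE = [[1, 0], [0, 1]]
W1 = [[1, 0], [0, -1]]   # E11 - E22
W2 = [[0, 1], [-1, 0]]   # E12 - E21
W3 = [[0, 1], [1, 0]]    # E12 + E21
p, q = 1, -1


def table_holds():
    """Verify the multiplication table of the w_i for p = 1, q = -1."""
    return (
        mul(W1, W1) == scale(p, ONE)
        and mul(W2, W2) == scale(q, ONE)
        and mul(W3, W3) == scale(-p * q, ONE)
        and mul(W1, W2) == W3
        and mul(W2, W1) == scale(-1, W3)
        and mul(W1, W3) == scale(p, W2)
        and mul(W3, W1) == scale(-p, W2)
        and mul(W2, W3) == scale(-q, W1)
        and mul(W3, W2) == scale(q, W1)
    )


def conjugate(a):
    """c0 - c1 w1 - c2 w2 - c3 w3, with the c's read off from the matrix entries."""
    (f, g), (h, i) = a
    half = Fraction(1, 2)
    c0, c1, c2, c3 = half * (f + i), half * (f - i), half * (g - h), half * (g + h)
    return add(scale(c0, ONE), scale(-c1, W1), scale(-c2, W2), scale(-c3, W3))


a = [[2, 3], [5, 7]]
b = [[1, -4], [6, 0]]
print(table_holds())
print(conjugate(a) == [[7, -3], [-5, 2]])
print(conjugate(mul(a, b)) == mul(conjugate(b), conjugate(a)))
```

In this model the conjugate of a matrix is its classical adjoint, which explains why $a + \bar{a}$ and $a\bar{a}$ are scalar.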

Now if $A$ is a Hermitian matrix over the quaternion ring $\mathfrak{R}$ and $g(\lambda) \in \mathfrak{C}[\lambda]$,
then $g(A)$ is also Hermitian. Further, if $A$ and $B$ are Hermitian, and $AB = BA$,
then also $AB$ is Hermitian. Now it will be found that certain theorems about
Hermitian matrices over a quaternion ring coincide in statement with known
theorems about arbitrary matrices over a commutative ring. At least the
statements of the theorems, and in some cases the proofs also, can be unified
by the following notation. We shall henceforth restrict ourselves to the
following two situations.

Case 1. $\mathfrak{R}$ is a quaternion ring with center $\mathfrak{C}$. We shall denote by $\mathcal{K}$ the
class of all Hermitian matrices of $\mathfrak{R}_n$.







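Case 2. $\mathfrak{R}$ is an arbitrary commutative ring with unit element and center
$\mathfrak{C} = \mathfrak{R}$. In this case, we let $\mathcal{K} = \mathfrak{R}_n = \mathfrak{C}_n$.

In either case, if $A \in \mathcal{K}$ and $g(\lambda) \in \mathfrak{C}[\lambda]$, then $g(A) \in \mathcal{K}$. Also, if $A \in \mathcal{K}$, then
$A \in \mathfrak{R}'_n$ and we have at our disposal all theorems proved for elements of $\mathfrak{R}'_n$.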


8. Some miscellaneous properties of elements of $\mathcal{K}$. We begin with a
theorem which is trivial in Case 2 and which, as a result of Moore's work ([6],
p. 147), is also known to be true for Hermitian matrices over a "number system
of type B". We shall, in fact, base our proof of the theorem only on the fact
that it is true for Hermitian matrices over real quaternions. The result in
question is the following

THEOREM 9. If $A \in \mathcal{K}$, $g_1(\lambda) \in \mathfrak{C}[\lambda]$, $g_2(\lambda) \in \mathfrak{C}[\lambda]$, then

$$|g_1(A)\, g_2(A)| = |g_1(A)| \cdot |g_2(A)|.$$

We need to establish this result only in the first case. Accordingly, we
assume that $A = (a_{ij})$ is a Hermitian matrix over the quaternion ring $\mathfrak{R}$ with
center $\mathfrak{C}$. Let

$$g_1(\lambda) = a_0 + a_1 \lambda + \cdots + a_k \lambda^{k}$$

and

$$g_2(\lambda) = b_0 + b_1 \lambda + \cdots + b_l \lambda^{l}.$$

We now need to establish some identities which are useful in establishing the
theorem.

Let $T$ denote the ring of rationals, and let $\alpha$, $\beta$, $x_{ii}$ $(i = 1, 2, \dots, n)$, $x^{(s)}_{ij}$
$(s = 0, 1, 2, 3;\ i, j = 1, 2, \dots, n;\ i < j)$, $y_s$ $(s = 0, 1, \dots, k)$ and $z_t$ $(t =
0, 1, \dots, l)$ be indeterminates, and let $T'$ be the ring obtained from $T$ by
adjoining all these indeterminates. Further, let $\mathfrak{R}'$ be the quaternion ring over
$T'$ defined as in the preceding section with $\alpha$ and $\beta$ taking the place of $p$ and $q$ in
the multiplication table. We now introduce a matrix $B = (b_{ij})$, where $b_{ii} = x_{ii}$
$(i = 1, 2, \dots, n)$, $b_{ij} = x^{(0)}_{ij} + x^{(1)}_{ij} w_1 + x^{(2)}_{ij} w_2 + x^{(3)}_{ij} w_3$ $(i, j = 1, 2, \dots, n;\
i < j)$, and $b_{ij} = \bar{b}_{ji}$ $(i > j)$. Also let us set

$$h_1(B) = y_0 + y_1 B + \cdots + y_k B^{k}$$

and

$$h_2(B) = z_0 + z_1 B + \cdots + z_l B^{l}.$$

Thus $h_1(B)$, $h_2(B)$ and $h_1(B) h_2(B)$ are Hermitian matrices over $\mathfrak{R}'$, and thus

$$F = |h_1(B)\, h_2(B)| - |h_1(B)| \cdot |h_2(B)|$$

is a polynomial in the various indeterminates, with integer coefficients. We
shall show that this is actually the zero polynomial. To this end, let us
formally replace the $x_{ii}$, $x^{(s)}_{ij}$, $y_s$, $z_t$ by arbitrary real numbers and $\alpha$, $\beta$ by arbitrary
negative real numbers. By this specialization, $h_1(B)$, $h_2(B)$, and $h_1(B) h_2(B)$
become Hermitian matrices over the algebra of real quaternions$^9$ and hence $F$
vanishes since our theorem is true for this case. Since $F$ vanishes for every
such specialization, it follows readily that $F = 0$ as an element of $T'$. Since $F$
has integer coefficients, the relation $F = 0$ remains true if the indeterminates
are replaced by arbitrary elements of a commutative ring $\mathfrak{C}$ with unit element,
it being understood that an integer $m$ is to be replaced by $m \cdot 1$, where 1 is the
unit element of $\mathfrak{C}$. It is clear, however, that by proper replacement of the
indeterminates in $B$ by elements of $\mathfrak{C}$, $B$ goes over into $A$, while $h_1(B)$ goes over
into $g_1(A)$ by also replacing $y_i$ by $a_i$, and $h_2(B)$ becomes $g_2(A)$ upon replacing
$z_i$ by $b_i$. Hence the theorem is established.

Now let $A$ be an element of $\mathcal{K}$ with characteristic polynomial $f(\lambda) =
|\lambda - A| = \lambda^{n} + a_1 \lambda^{n-1} + \cdots + a_n$. Let $g(\lambda) = b_0 \lambda^{m} + b_1 \lambda^{m-1} + \cdots + b_m$ be an element of
$\mathfrak{C}[\lambda]$, and let us define the resultant $R(f, g)$ of $f(\lambda)$ and $g(\lambda)$ in a formal way
by the Sylvester determinant

$$R(f, g) = \begin{vmatrix}
1 & a_1 & \cdots & a_n & & \\
 & \ddots & & & \ddots & \\
 & & 1 & a_1 & \cdots & a_n \\
b_0 & b_1 & \cdots & b_m & & \\
 & \ddots & & & \ddots & \\
 & & b_0 & b_1 & \cdots & b_m
\end{vmatrix},$$

the blank spaces consisting of zeros. We may now prove


THEOREM 10. If $A \in \mathcal{K}$ and $g(\lambda) \in \mathfrak{C}[\lambda]$, then

$$|g(A)| = R(f, g). \tag{32}$$

Thus, in particular, if $A_1$ and $A_2$ are elements of $\mathcal{K}$ with the same characteristic
polynomial, then

$$|g(A_1)| = |g(A_2)|$$

for every element $g(\lambda)$ of $\mathfrak{C}[\lambda]$.

The proof for the commutative case has already been given in [3] and is 
naturally based on the similar theorem due to Frobenius for the case in which 
the coefficients are from an algebraically closed field. The proof for quaternion 
rings can be carried through in the same general way as the proof of Theorem 9 
except for the necessity of proving relation (32) for the special case of quater- 
nions over the reals. 
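Relation (32) can be illustrated in Case 2. The following sketch assumes integer matrices, with $m$ rows of $f$-coefficients followed by $n$ rows of $g$-coefficients in the Sylvester determinant (an arrangement assumed here, in line with the usual convention); the function names are illustrative and not taken from [3].

```python
from itertools import combinations, permutations


def det(m):
    """Determinant by the Leibniz expansion (adequate for small integer matrices)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for x in range(n) for y in range(x + 1, n) if perm[x] > perm[y]
        )
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def matmul(x, y):
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def char_poly(m):
    """[1, a_1, ..., a_n] with |lambda - A| = lambda^n + a_1 lambda^(n-1) + ... + a_n."""
    n = len(m)
    coeffs = [1]
    for order in range(1, n + 1):
        minors = sum(
            det([[m[r][c] for c in idx] for r in idx])
            for idx in combinations(range(n), order)
        )
        coeffs.append((-1) ** order * minors)
    return coeffs


def poly_at_matrix(coeffs, m):
    """Horner evaluation of b_0 lambda^m + ... + b_m at the matrix A."""
    n = len(m)
    result = [[0] * n for _ in range(n)]
    for b in coeffs:
        result = matmul(result, m)
        for d in range(n):
            result[d][d] += b
    return result


def sylvester(f, g):
    """Sylvester matrix: deg(g) shifted rows of f, then deg(f) shifted rows of g."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = [[0] * s + f + [0] * (size - s - len(f)) for s in range(m)]
    rows += [[0] * s + g + [0] * (size - s - len(g)) for s in range(n)]
    return rows


A = [[2, -1, 3], [1, 4, -2], [0, 3, 1]]
g = [2, -3, 5]                      # g(lambda) = 2 lambda^2 - 3 lambda + 5
lhs = det(poly_at_matrix(g, A))     # |g(A)|
rhs = det(sylvester(char_poly(A), g))
print(lhs == rhs)
```

The right side depends on $A$ only through its characteristic polynomial, which is the content of the second statement of the theorem.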

Thus, suppose $A$ is a Hermitian matrix over real quaternions and let $g(\lambda)$
have real coefficients. Now $g(\lambda)$ can be factored in the real field into a product
of linear or quadratic factors. Since

$$R(f, g_1 g_2) = R(f, g_1)\, R(f, g_2),$$

9 To get the usual basis we only need to replace $w_1$ by $w_1/(-\alpha)^{1/2}$, $w_2$ by $w_2/(-\beta)^{1/2}$ and
$w_3$ by $w_3/(\alpha\beta)^{1/2}$.










it is sufficient, in view of Theorem 9, to prove formula (32) for linear and
quadratic $g(\lambda)$. A direct calculation shows it to be true for linear $g(\lambda)$ and therefore
also for quadratic $g(\lambda)$ which can be factored into real linear factors. Now set

$$g(\lambda) = z_0 + z_1 \lambda + z_2 \lambda^{2},$$

with indeterminate $z_0, z_1, z_2$. Then $|g(A)|$ and $R(f, g)$ are polynomials in
$z_0, z_1, z_2$ which are equal for all choices of real $z_0, z_1, z_2$ such that $z_1^2 - 4 z_0 z_2 \ge 0$.
This implies, however, that $|g(A)|$ and $R(f, g)$ are identically equal as
polynomials in $z_0, z_1, z_2$ and thus are equal for all real quadratic $g(\lambda)$. Relation
(32) is established for the case in which $A$ is a Hermitian matrix over real
quaternions and $g(\lambda)$ has real coefficients. The rest of the proof of Theorem 10
follows by arguments similar to those used in the proof of the preceding theorem.
The following result is fundamental for certain purposes: 


THEOREM 11. If $A \in \mathcal{K}$, the following statements are equivalent:
(i) $|A|$ is a divisor of zero in $\mathfrak{R}$.

(ii) $A$ is a divisor of zero in $\mathfrak{R}_n$.

(iii) $A$ is a divisor of zero in $\mathfrak{C}[A]$.

It is obvious that (iii) implies (ii) and furthermore (ii) implies (i) by Theorem
3. The essential part of the theorem is contained in the proof that (i) implies
(iii). This has been established in [3] for Case 2 and inasmuch as this proof will
apply equally to the first case, we shall not reproduce the proof here. We may
remark that if $|A|$ is a divisor of zero in $\mathfrak{R}$, it is also a divisor of zero in $\mathfrak{C}$.
This fact is important in the proof.

An element $g(\lambda)$ of $\mathfrak{C}[\lambda]$ is said to be prime to an ideal $\mathfrak{m}$ if $g(\lambda) h(\lambda) \equiv 0 \ (\mathfrak{m})$
implies that $h(\lambda) \equiv 0 \ (\mathfrak{m})$, it being understood that $h(\lambda) \in \mathfrak{C}[\lambda]$. If $\mathfrak{m}$ is the
minimum ideal of the matrix $A$, it follows from the preceding theorem,
together with the fact that $\mathfrak{C}[A] \cong \mathfrak{C}[\lambda]/\mathfrak{m}$, that $g(\lambda)$ is prime to $\mathfrak{m}$ if and only if
$|g(A)|$ is not a divisor of zero in $\mathfrak{C}$. But $|g(A)| = R(f, g)$ and it has been
shown elsewhere (see [3], p. 170) that $g(\lambda)$ is prime to $(f(\lambda))$ if and only if $R(f, g)$
is not a divisor of zero in $\mathfrak{C}$. We have therefore proved

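THEOREM 12. If $A \in \mathcal{K}$ has characteristic polynomial $f(\lambda)$ and minimum
ideal $\mathfrak{m}$, an element $g(\lambda)$ of $\mathfrak{C}[\lambda]$ is prime to $\mathfrak{m}$ if and only if it is prime to $(f(\lambda))$.

Now Theorems 5 and 11 show at once that if $A \in \mathcal{K}$, then $A$ has an inverse
in $\mathfrak{R}_n$ (actually in $\mathfrak{C}[A]$) if and only if $|A|$ has an inverse in $\mathfrak{R}$. Since, by the
corollary to Theorem 6, if $h(\lambda) \equiv 0 \ (\mathfrak{m})$, then $[h(\lambda)]^{n} \equiv 0 \ (f(\lambda))$, it is easy to
see that $(h(\lambda), \mathfrak{m}) = (1)$ if and only if $(h(\lambda), f(\lambda)) = (1)$. The following result
is then almost obvious.

THEOREM 13. If $A \in \mathcal{K}$ and $h(\lambda) \in \mathfrak{C}[\lambda]$, then $h(A)$ has an inverse in $\mathfrak{C}[A]$
if and only if in $\mathfrak{C}[\lambda]$ we have $(h(\lambda), f(\lambda)) = (1)$.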


We conclude with an application of this result. If $A$ and $B$ are elements of
$\mathcal{K}$ and $AB = BA$, then an examination of the Sylvester identities (see [9],
p. 27) shows that they can be carried over to the present situation. Hence,
in particular, there is an expression of the form

$$f'(A)\, B = g(A),$$

where $f'(\lambda)$ is the formal derivative of $f(\lambda)$ with respect to $\lambda$, and $g(\lambda) \in \mathfrak{C}[\lambda]$.
If now $(f'(\lambda), f(\lambda)) = (1)$, $f'(A)$ has, by Theorem 13, an inverse in $\mathfrak{C}[A]$. We
have therefore proved

THEOREM 14. If $A \in \mathcal{K}$ and in $\mathfrak{C}[\lambda]$, $(f'(\lambda), f(\lambda)) = (1)$, the only elements of
$\mathcal{K}$ commutative with $A$ are the elements of $\mathfrak{C}[A]$.


BIBLIOGRAPHY 


1. A. A. Albert, Modern Higher Algebra, Chicago, 1937.

2. N. Jacobson, An application of E. H. Moore's determinant of a Hermitian matrix, Bulletin
of the American Mathematical Society, vol. 45(1939), pp. 745-748.

3. N. H. McCoy, Divisors of zero in matric rings, ibid., vol. 47(1941), pp. 166-172.

4. N. H. McCoy, A generalization of Ostrowski's theorem on matric identities, ibid., vol. 46
(1940), pp. 490-495.

5. N. H. McCoy, Concerning matrices with elements in a commutative ring, ibid., vol. 45(1939),
pp. 280-284.

6. E. H. Moore and R. W. Barnard, General Analysis, Part I, Philadelphia, 1935.

7. A. Ostrowski, On a theorem concerning identical relations between matrices, Quarterly
Journal of Mathematics, vol. 9(1938), pp. 241-245.

8. H. B. Phillips, Functions of matrices, American Journal of Mathematics, vol. 41(1919),
pp. 266-278.

9. J. H. M. Wedderburn, Lectures on Matrices, American Mathematical Society Colloquium
Publications, vol. 17(1934).


SMITH COLLEGE.
















CENTRAL CHAINS OF IDEALS IN AN ASSOCIATIVE RING 


By S. A. JENNINGS 


In this paper we have two main ideas in mind. It is well known that there 
is a considerable similarity between those properties of a group associated with 
its commutator structure and the theory of Lie rings.¹ We show first that, with
suitable definitions of "commutator ideal", many of the properties of commutator
subgroups have analogues in the theory of associative rings. In particular,
we are interested in extending the notions of "nilpotent group" and
“solvable group” to rings. On the other hand, every associative ring determines 
a Lie ring, which we have called the “associated Lie ring” (cf. §6), and the 
question arises as to how far the solvability or nilpotency of this Lie ring deter- 
mines corresponding properties of the original associative ring as we have defined 
them in the first part. In the case of algebras we answer this question com- 
pletely and show that if the Lie algebra is solvable (or nilpotent) the associative 
algebra has the corresponding property. For a general ring, however, we obtain 
only a partial answer. 

In seeking an analogue for the commutator subgroup of two given normal
subgroups, a difficulty arises at once. As is well known, the subgroup generated
by all commutators of a group is a normal subgroup; in a ring the subring
generated by all elements of the form $xy - yx$ is not in general an ideal. We
overcome this difficulty by defining the "commutator ideal" of two given ideals
A, B of the associative ring R as the smallest ideal of R containing all elements
of the form $ab - ba$, where $a \in A$, $b \in B$. However, there are some disadvantages
to this definition, as compared with the corresponding one in the theory of
groups, and in consequence, commutator subgroups enjoy some properties which
have no analogues for commutator ideals. In a group, a normal subgroup,
besides having the property that its residue classes again form a group, is such
that it is transformed into itself by inner automorphisms. Ideals in a ring play
no such dual role, and in general those properties of commutator subgroups
depending on the second of these facts have no analogue in our theory.


1. Commutators. Let A and B be any two (not necessarily distinct) ideals
of an associative ring R. We define the commutator ideal, $A \circ B$, of A and B
to be the ideal of R generated by all elements of the form $ab - ba$, where $a \in A$,
$b \in B$, that is, $A \circ B$ is the smallest ideal of R containing all elements of the
form $ab - ba$.

In what follows, it will be convenient to write $xy - yx$ as $x \circ y$; to avoid the
possibility of confusion between this notation and that defined above for
commutator ideals, we shall use lower case letters for individual ring elements and
denote ideals by capital letters.

Received November 8, 1941. Presented to the American Mathematical Society,
September 2, 1941.

¹ For properties of commutator subgroups see [1] and [7]. (Numbers in brackets refer
to the bibliography.) The relations between groups and Lie rings are discussed in [5] and [6].

We consider some elementary properties of commutator ideals. Using our
definition, any element of $A \circ B$ may be written as a sum of one or more elements
of the following types:

$$a \circ b, \quad x(a \circ b), \quad (a \circ b)y, \quad x(a \circ b)y,$$

where a, b belong to A, B respectively, and x, y are in R. For any three
elements of R, however, we have the Jacobi identity

$$(x \circ y) \circ z + (y \circ z) \circ x + (z \circ x) \circ y = 0$$

and hence

$$(a \circ b)y = y(a \circ b) + (a \circ b) \circ y = y(a \circ b) + a \circ (b \circ y) + (a \circ y) \circ b;$$

since A, B are ideals, $a \circ y \in A$, $b \circ y \in B$, and hence $(a \circ b)y$ may be expressed
as a sum of elements of the types $(a \circ b)$, $x(a \circ b)$. Similarly $x(a \circ b)y$ may be
written as a sum of elements having the form $x(a \circ b)$. We have proved, therefore,
that any element of $A \circ B$ may be written as a sum of elements of one or
both of the types

(1A) $a \circ b; \quad x(a \circ b)$.
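The two identities used in the reduction to (1A) can be spot-checked numerically. The following is a minimal sketch, assuming integer matrices as a stand-in for a general associative ring; the helper names (`comm`, `check_identities`) are illustrative and not taken from the paper. A random test of this kind is a sanity check, not a proof.

```python
import random

def mul(x, y):
    """Product of two square integer matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(x, y)]

def comm(x, y):
    """The commutator x o y = xy - yx."""
    return sub(mul(x, y), mul(y, x))

def rand_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

def check_identities(trials=50, n=3, seed=0):
    """Test the Jacobi identity and the expansion of (a o b)y on random data."""
    rng = random.Random(seed)
    zero = [[0] * n for _ in range(n)]
    for _ in range(trials):
        a, b, y = (rand_matrix(n, rng) for _ in range(3))
        # (a o b) o y + (b o y) o a + (y o a) o b = 0
        jacobi = add(add(comm(comm(a, b), y), comm(comm(b, y), a)),
                     comm(comm(y, a), b))
        if jacobi != zero:
            return False
        # (a o b) y = y (a o b) + a o (b o y) + (a o y) o b
        lhs = mul(comm(a, b), y)
        rhs = add(add(mul(y, comm(a, b)), comm(a, comm(b, y))),
                  comm(comm(a, y), b))
        if lhs != rhs:
            return False
    return True
```

Calling `check_identities()` returns `True` when every random trial satisfies both identities.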


If C is the subring of R generated by all elements $a \circ b$, where $a \in A$, $b \in B$,
the remark embodied in (1A) is equivalent to the equality

(1B) $A \circ B = C \cup (R \cdot C)$,

and if R has a unit element we have

(1C) $A \circ B = R \cdot C$.

We have also for any ring R

(1D) $A \circ B = B \circ A$,
(1E) $A \circ B \subseteq A \cdot B \subseteq A \cap B$.


It should be noted that our definition of commutator ideal depends not only 
on the ideals A and B but also on the underlying ring R; for example, if S is a 
subring of R containing both A and B, then the ideal A o B taken relative to the 
ring R will in general be greater than the ideal A o B relative to S. This state 
of affairs does not arise in the theory of groups or Lie rings. For the most part 
no ambiguity arises, and unless otherwise stated (cf. §5) all commutator ideals 
are to be taken relative to the whole ring. 
















2. Central chains of ideals. Let

(2A) $R = M_1 \supseteq M_2 \supseteq \cdots \supseteq M_n \supseteq M_{n+1} = 0$

be a chain of ideals of R. The chain (2A) will be called a central chain of ideals²
if we have

(2B) $R \circ M_i \subseteq M_{i+1}$ $(i = 1, 2, \cdots, n)$.

If the situation is as in (2A) with $M_{n+1}$ equal to the zero ideal, we say n is the
(formal) length of the chain. It will be convenient to define
$M_{n+2} = M_{n+3} = \cdots = 0$ for any chain of ideals. The condition (2B) is equivalent to the
condition that $M_i/M_{i+1}$ belong to the center of $R/M_{i+1}$ $(i = 1, 2, \cdots)$.

Rings which possess central chains of ideals have special properties; we
investigate some of them by considering a particular central chain. For any ring
R we may form a descending chain of ideals $R = H_1 \supseteq H_2 \supseteq \cdots$ by setting

(2C) $H_1 = R$; $H_{\rho+1} = H_\rho \circ R$ for $\rho \geq 1$.

We say that the ring R is of finite class if $H_m = 0$ for some m; if the situation is
as follows

(2D) $R = H_1 \supset H_2 \supset \cdots \supset H_c \supset H_{c+1} = 0$,

then c will be said to be the class of R. The chain (2D) will be called the lower
central chain of R; because of (2C) it is clear that the lower central chain is a
central chain as defined above, and hence every ring of finite class has a central
chain of ideals.


THEOREM 2.1. A necessary and sufficient condition that a ring R have a central
chain of ideals is that R be of finite class. The length of any central chain of R is
at least equal to the class of R, and if

$$R = M_1 \supseteq M_2 \supseteq \cdots \supseteq M_{n+1} = 0$$

is any central chain, and

$$R = H_1 \supset H_2 \supset \cdots \supset H_c \supset H_{c+1} = 0$$

is the lower central chain of R, we have

$$M_i \supseteq H_i \qquad (i = 1, 2, \cdots).$$
Proof. Suppose that R has a central chain

$$R = M_1 \supseteq M_2 \supseteq \cdots \supseteq M_{n+1} = 0;$$

then since $H_1 = M_1 = R$, we can suppose that $H_{i-1} \subseteq M_{i-1}$ and show that
this implies $H_i \subseteq M_i$. However, since, by (2C) and (2B),

$$H_i = (H_{i-1} \circ R) \subseteq (M_{i-1} \circ R) \subseteq M_i,$$

we have $H_i \subseteq M_i$ as required. Now since $M_{n+1} = 0$, $H_{n+1} = 0$, and R is of
finite class. The remainder of the theorem follows at once.

² For the corresponding concept in the theory of groups see [1] or [7].

The following theorems show that rings of finite class are fairly common. 

THEOREM 2.2. A necessary and sufficient condition that a ring R be commutative
is that R be of class one, that is, $H_2 = 0$.

Proof. If R is commutative, then $xy - yx = 0$ for all $x, y \in R$, and hence,
by (1A) and (2C), $H_2 = 0$. Conversely, if $H_2 = 0$, all expressions of the form
$xy - yx$ must vanish, and hence R is commutative.

For any ring R, the ideal $R \circ R\ (= H_2)$ will be called the derived ring. It is
characterized in the following

THEOREM 2.3. The derived ring of R is the intersection of all ideals of R whose
factor rings are commutative; $R/(R \circ R)$ is commutative and if A is any ideal of R
containing $(R \circ R)$, then R/A is commutative.

Proof. If R/A is commutative then $xy - yx \equiv 0$ modulo A, and hence
$(R \circ R) \subseteq A$. Conversely, if $(R \circ R) \subseteq A$ then $xy - yx \equiv 0$ modulo A, and
R/A is commutative.

THEOREM 2.4. Every nilpotent ring N is of finite class, and the class is at most
equal to the exponent of N. In particular, the powers of N

$$N = N^1 \supset N^2 \supset \cdots \supset N^\lambda \supset N^{\lambda+1} = 0$$

form a central chain of ideals of N with the stronger property

$$N^\rho \circ N^\sigma \subseteq N^{\rho+\sigma}.$$

Proof. If N has exponent $\lambda$, then from (1E) and (2C) we have $H_\rho \subseteq N^\rho$ and
hence $H_{\lambda+1} = 0$. Similarly $N^\rho \circ N^\sigma \subseteq N^\rho \cdot N^\sigma \subseteq N^{\rho+\sigma}$.
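As an illustration of Theorem 2.4, the lower central chain of a small nilpotent ring can be computed by linear algebra. The sketch below is not from the paper: it assumes the ring of strictly upper triangular 4 x 4 matrices over the prime field GF(5), where every additive subgroup is a subspace, so an ideal can be represented by a basis. All function names are illustrative.

```python
from itertools import product

P = 5   # prime modulus: entries live in GF(5) (an assumption of this sketch)
N = 4   # matrix size

def unit(i, j):
    """Matrix unit E_ij (0-based), stored row-major as a flat tuple."""
    return tuple(int((r, c) == (i, j)) for r in range(N) for c in range(N))

def mul(x, y):
    """Matrix product modulo P."""
    return tuple(sum(x[i * N + k] * y[k * N + j] for k in range(N)) % P
                 for i in range(N) for j in range(N))

def comm(x, y):
    """Commutator x o y = xy - yx modulo P."""
    return tuple((a - b) % P for a, b in zip(mul(x, y), mul(y, x)))

def insert(v, basis):
    """Reduce v against the echelon basis {pivot: vector}; add it if non-zero.
    Returns True when the span grew."""
    v = list(v)
    for piv in sorted(basis):
        if v[piv]:
            f = v[piv]
            v = [(a - f * b) % P for a, b in zip(v, basis[piv])]
    for idx, val in enumerate(v):
        if val:
            inv = pow(val, P - 2, P)   # inverse via Fermat's little theorem
            basis[idx] = tuple(a * inv % P for a in v)
            return True
    return False

def commutator_ideal(A, B, R):
    """Basis of the ideal of R generated by all a o b, a in A, b in B."""
    basis, queue = {}, [comm(a, b) for a, b in product(A, B)]
    while queue:
        v = queue.pop()
        if insert(v, basis):           # new element: close under R on both sides
            for r in R:
                queue += [mul(r, v), mul(v, r)]
    return list(basis.values())

def lower_central_dims(R):
    """Dimensions of H_1, H_2, ... until the chain reaches 0 or stalls."""
    dims, H = [len(R)], R
    while H:
        H = commutator_ideal(H, R, R)
        stalled = len(H) == dims[-1]
        dims.append(len(H))
        if stalled:
            break
    return dims

strict_upper = [unit(i, j) for i in range(N) for j in range(i + 1, N)]
print(lower_central_dims(strict_upper))   # [6, 3, 1, 0]
```

Here $N^4 = 0$, so the exponent is 3 in the convention of the theorem, and the computed chain reaches zero at $H_4$, giving class 3, which meets the bound.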


3. Properties of the lower central chain. We consider now some further
properties of central chains, and in particular, of the lower central chain.³
Unless otherwise mentioned, $H_i$ will denote the i-th member of the lower chain
of an arbitrary ring R of finite class. To stress the fact that $H_i$ belongs to the
ring R, we sometimes write $H_i = H_i(R)$.

THEOREM 3.1. If $R = M_1 \supseteq M_2 \supseteq \cdots \supseteq M_{n+1} = 0$ is any central chain of
ideals of R, and $R = H_1 \supset H_2 \supset \cdots \supset H_{c+1} = 0$ is the lower central chain, then
for $\rho, \sigma = 1, 2, \cdots$

$$H_\rho \cdot M_\sigma \subseteq M_{\rho+\sigma-1}; \qquad M_\sigma \cdot H_\rho \subseteq M_{\rho+\sigma-1}.$$

THEOREM 3.2. Under the same hypothesis as (3.1),

$$H_\rho \circ M_\sigma \subseteq M_{\rho+\sigma} \qquad (\rho, \sigma = 1, 2, \cdots).$$

³ The theorems in this section, with the exception of (3.1), (3.3) and (3.9), have analogues
in the theory of groups.


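We prove first (3.1) and use it to establish (3.2).

Since $M_\sigma$ is an ideal and $H_1 = R$, we have $H_1 \cdot M_\sigma \subseteq M_\sigma$, and hence (3.1)
holds for $\rho = 1$ and all $\sigma$. We use induction and suppose that $H_\tau \cdot M_\sigma \subseteq M_{\tau+\sigma-1}$
for $\tau \leq \rho - 1$ and all $\sigma$; we must show that this implies that $H_\rho \cdot M_\sigma \subseteq M_{\rho+\sigma-1}$
for all $\sigma$. Let a be any element of R, $h_{\rho-1}$ any element of $H_{\rho-1}$, and $m_\sigma$ any
element of $M_\sigma$. Then $h_{\rho-1} \circ a$ is in $H_\rho$. Consider the element $(h_{\rho-1} \circ a)m_\sigma$.
Using the identity

$$(x \circ y)z = x(y \circ z) + (xz) \circ y,$$

we have

$$(h_{\rho-1} \circ a)m_\sigma = h_{\rho-1}(a \circ m_\sigma) + (h_{\rho-1} m_\sigma) \circ a.$$

Now $a \circ m_\sigma \in M_{\sigma+1}$, and hence by our induction $h_{\rho-1}(a \circ m_\sigma) \in M_{\rho+\sigma-1}$; again,
$h_{\rho-1} m_\sigma \in M_{\rho+\sigma-2}$ and hence $(h_{\rho-1} m_\sigma) \circ a \in M_{\rho+\sigma-1}$, which proves that
$(h_{\rho-1} \circ a)m_\sigma \in M_{\rho+\sigma-1}$. Now by (1A) and (2C) every element of $H_\rho$ may be
written as a sum of elements of the forms $(h_{\rho-1} \circ a)$, $x(h_{\rho-1} \circ a)$, where x runs
over all elements of R, and it follows that $H_\rho \cdot M_\sigma \subseteq M_{\rho+\sigma-1}$, as required.
Similarly, using the identity

$$z(x \circ y) = (zx) \circ y + (y \circ z)x$$

we may prove that $M_\sigma \cdot H_\rho \subseteq M_{\rho+\sigma-1}$.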
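For the reader who wishes to confirm the two product identities used in this proof, both follow by direct expansion with $x \circ y = xy - yx$. The display below is a verification added for convenience, not part of the original argument:

```latex
\begin{align*}
x(y \circ z) + (xz) \circ y
  &= (xyz - xzy) + (xzy - yxz) = xyz - yxz = (x \circ y)z, \\
(zx) \circ y + (y \circ z)x
  &= (zxy - yzx) + (yzx - zyx) = zxy - zyx = z(x \circ y).
\end{align*}
```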

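To prove (3.2) we note that for all $\sigma$, $H_1 \circ M_\sigma \subseteq M_{\sigma+1}$, since $M_\sigma$ has the
property (2B). We suppose, therefore, that $H_\tau \circ M_\sigma \subseteq M_{\tau+\sigma}$ for $\tau \leq \rho - 1$
and all $\sigma$, and proceed as above. Using the same notation as before consider
$(h_{\rho-1} \circ a) \circ m_\sigma$. The Jacobi identity gives

$$(h_{\rho-1} \circ a) \circ m_\sigma = h_{\rho-1} \circ (a \circ m_\sigma) + a \circ (m_\sigma \circ h_{\rho-1})$$

and hence by our induction $(h_{\rho-1} \circ a) \circ m_\sigma$ belongs to $M_{\rho+\sigma}$. For brevity, set
$c_\rho = (h_{\rho-1} \circ a)$; our theorem will be proved if we show that every element
having the form $(xc_\rho) \circ m_\sigma$ belongs to $M_{\rho+\sigma}$. We have identically

$$(xc_\rho) \circ m_\sigma = x(c_\rho \circ m_\sigma) + (x \circ m_\sigma)c_\rho.$$

Since we have shown $(c_\rho \circ m_\sigma) \in M_{\rho+\sigma}$, $x(c_\rho \circ m_\sigma) \in M_{\rho+\sigma}$. Now $(x \circ m_\sigma) \in M_{\sigma+1}$,
and hence, using (3.1), $(x \circ m_\sigma)c_\rho \in M_{\rho+\sigma}$, and hence $(xc_\rho) \circ m_\sigma \in M_{\rho+\sigma}$, and we
have our theorem.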

As immediate corollaries of the above we state 


THEOREM 3.3. $H_\rho \cdot H_\sigma \subseteq H_{\rho+\sigma-1}$.

THEOREM 3.4. $H_\rho \circ H_\sigma \subseteq H_{\rho+\sigma}$.


We consider now some further properties of the commutator ideal A o B. 
The following lemmas are readily verified. 










LEMMA 3.5. Let the ring R be mapped homomorphically on the ring $\bar{R}$, $R \to \bar{R}$,
and let A and B be any two ideals of R and $\bar{A}$, $\bar{B}$ the corresponding ideals in $\bar{R}$ such
that $A \to \bar{A}$, $B \to \bar{B}$. Then in this homomorphism any element of $A \circ B$ is mapped
upon an element of $\bar{A} \circ \bar{B}$.

LEMMA 3.6. Under the same assumptions as in (3.5), let M be the ideal in R
such that $R/M \cong \bar{R}$. Then the ideal $(A \circ B) \cup M$ maps completely into the ideal
$\bar{A} \circ \bar{B}$ in $\bar{R}$, that is, every element of $\bar{A} \circ \bar{B}$ is the image of an element in $(A \circ B) \cup M$.

As consequences of (3.5) and (3.6) we have 

THEOREM 3.7. If M is any ideal of R such that R/M is of class $\rho - 1$, then
$M \supseteq H_\rho$; in other words, $H_\rho$ is the smallest ideal in R whose factor ring is of
class $\rho - 1$.

THEOREM 3.8. If M is any ideal in R, then $H_\rho \cup M$ is the ideal in R which maps
completely upon the ideal $H_\rho(R/M)$ in the homomorphism of R upon R/M.


For later use we state the following consequence of (3.3).

THEOREM 3.9. If R is of finite class the ideal $R \circ R$ is nilpotent.

Proof. Since $R \circ R = H_2$, we have $H_2 \cdot H_2 \subseteq H_3$, and in general $H_2^\rho \subseteq H_{\rho+1}$,
and hence $H_2^c = 0$, which proves that $H_2$ is nilpotent, of exponent at most $c - 1$.
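The inductive step behind "in general" in this proof may be spelled out as follows. This is a sketch of the omitted step, assuming only (3.3) with $\sigma = 2$ and the definition of class:

```latex
\begin{align*}
H_2^{\rho+1} = H_2^{\rho} \cdot H_2
  &\subseteq H_{\rho+1} \cdot H_2
   \subseteq H_{(\rho+1)+2-1} = H_{\rho+2}, \\
H_2^{c} &\subseteq H_{c+1} = 0 .
\end{align*}
```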


4. The upper central chain. For certain types of rings of finite class we may
define also the upper central chain of ideals. For the rest of this section, let R
be a ring of class c with ascending chain condition for ideals.⁴ Consider the
ideal $H_c$; since $H_c \circ R = H_{c+1} = 0$, $H_c$ belongs to the center of R, which is
therefore not zero. We see also that every ring of finite class contains at least
one "central ideal",⁵ namely, $H_c$. Let $Z_1$ denote the maximal central ideal of R.
If we set $Z_0$ equal to the 0-ideal, we define the upper central chain of R

(4A) $0 = Z_0 \subset Z_1 \subset Z_2 \subset \cdots \subset Z_{c'} = R$

by setting $Z_i$ equal to the ideal in R which maps upon the ideal $Z_1(R/Z_{i-1})$ of
$R/Z_{i-1}$, that is, $Z_i$ is the maximal ideal of R contained in the center of R modulo
$Z_{i-1}$. It is clear that the chain (4A) is a central chain, since if a is any element
of R and $z_i$ belongs to $Z_i$, then $az_i - z_ia \equiv 0$ modulo $Z_{i-1}$, and hence

$$R \circ Z_i \subseteq Z_{i-1}.$$
Let

$$R = M_1 \supseteq M_2 \supseteq \cdots \supseteq M_n \supseteq M_{n+1} = 0$$

be any central chain of ideals; then since by (2B) $M_n$ is a central ideal, we have
$M_n \subseteq Z_1$. Let us assume that we know that $M_{n-i+2} \subseteq Z_{i-1}$ and show that
this implies that $M_{n-i+1} \subseteq Z_i$. We have $M_{n-i+1} \circ R \subseteq M_{n-i+2} \subseteq Z_{i-1}$ and
hence $M_{n-i+1}$ belongs to the center of R modulo $Z_{i-1}$ and hence is contained
in $Z_i$. We have proved that

(4B) $Z_i \supseteq M_{n-i+1}$ $(i = 1, 2, \cdots)$.

⁴ If the elements of R are well ordered, the assumption of the maximal condition is not
necessary for the definition of the upper central chain.

⁵ A central ideal is an ideal contained in the center.
It follows that the length, $c'$, of the upper central chain is less than or equal to
the length of any central chain, and hence, by (2.1), must equal the class c of
the ring. We have proved, therefore, the following

THEOREM 4.1. If R is a ring of finite class with ascending chain condition for
two sided ideals, the upper central chain

$$R = Z_c \supseteq Z_{c-1} \supseteq \cdots \supseteq Z_2 \supseteq Z_1 \supseteq Z_0 = 0$$

exists and has the property that, if

$$R = M_1 \supseteq M_2 \supseteq \cdots \supseteq M_n \supseteq M_{n+1} = 0$$

is any central chain of R, then

$$Z_i \supseteq M_{n-i+1} \qquad (i = 1, 2, \cdots).$$

In particular, $Z_i \supseteq H_{c-i+1}$.

It will be convenient to define $Z_i = 0$ for $i = -1, -2, \cdots$. Then using
(3.1) and (3.2) we have at once the following two theorems.

THEOREM 4.2. $H_i \cdot Z_j \subseteq Z_{j-i+1}$; $Z_j \cdot H_i \subseteq Z_{j-i+1}$. In particular,
$H_i \cdot Z_{i-1} = Z_{i-1} \cdot H_i = 0$.

THEOREM 4.3. $H_i \circ Z_j \subseteq Z_{j-i}$. In particular, $H_i \circ Z_i = 0$.

The following result illustrates the close analogy between the properties of the
upper central chain and those of the upper central series of a group (cf. [7],
p. 108, Theorem 14).


THEOREM 4.4. If the ideal N is contained in $Z_{i+1}$, but not contained in $Z_i$,
then the chain

$$N \supset N \cap Z_i \supset N \cap Z_{i-1} \supset \cdots \supset N \cap Z_1 \supset 0$$

is a central chain of N, and is strictly decreasing.

Proof. The chain is obviously a central chain: we must show that it is
strictly decreasing. We have

$$R \circ N \subseteq N \cap (R \circ Z_{i+1}) \subseteq N \cap Z_i.$$

Since N is not contained in $Z_i$, $R \circ N$ is not contained in $Z_{i-1}$ and hence $N \cap Z_i$
is not contained in $N \cap Z_{i-1}$; repeating this argument using $N \cap Z_i$ instead
of N, etc., we see that the chain is strictly decreasing.

The following is an immediate consequence of (4.4).

THEOREM 4.5. Any ideal of a ring of finite class contains elements of the
maximal central ideal $Z_1$.










It is convenient at this point to mention the notion of centralisor ideal. Let
A be any ideal of R; then the centralisor ideal of A, C(A), may be defined as the
maximal ideal of R such that $C(A) \circ A = 0$. If B is any ideal such that
$A \circ B = 0$, then $B \subseteq C(A)$, and alternatively we may define C(A) as the union
of all such ideals. The maximal central ideal $Z_1$ could be defined as the
centralisor of R, and similarly for the other terms of the upper central chain.


5. Solvable rings. We have defined the derived ring $R \circ R$ of the given ring
R, and have seen that it is the intersection of all ideals of R whose factor rings
are commutative. It is convenient to denote the derived ring by $R^{(1)}$, and we
set $R^{(0)} = R$. We may form the derived ring of $R^{(1)}$, which we denote by $R^{(2)}$,
and in general form $R^{(n)}$, the derived ring of $R^{(n-1)}$. We obtain in this way a
chain of subrings

$$R = R^{(0)} \supset R^{(1)} \supset R^{(2)} \supset \cdots \supset R^{(n)} \supset \cdots$$

such that $R^{(n)}$ is the derived ring of $R^{(n-1)}$ $(n = 1, 2, \cdots)$. If for some integer k
we have $R^{(k)} = 0$ the ring R is said to be solvable and we call the chain

(5A) $R = R^{(0)} \supset R^{(1)} \supset R^{(2)} \supset \cdots \supset R^{(k)} = 0$

the derived chain of R. We note that while $R^{(n)}$ is an ideal of $R^{(n-1)}$, it may not
be an ideal of R.
We prove first the following

THEOREM 5.1. A necessary and sufficient condition that the ring R be solvable
is that there exist a chain of subrings of R

$$R = A_0 \supseteq A_1 \supseteq A_2 \supseteq \cdots \supseteq A_{n-1} \supseteq A_n = 0$$

terminating with zero, such that

(1) $A_\rho$ is an ideal of $A_{\rho-1}$ $(\rho = 1, 2, \cdots)$,

(2) $A_{\rho-1}/A_\rho$ is commutative $(\rho = 1, 2, \cdots)$.

Proof. The necessity follows since the derived chain of a solvable ring satisfies
(1) and (2). Conversely, the conditions are sufficient, because, using (2.3), we
see that $A_\rho \supseteq R^{(\rho)}$ $(\rho = 1, 2, \cdots)$ and hence $R^{(n)} = 0$, if $A_n = 0$.

It follows at once that if R is a ring with a central chain of ideals then these
ideals satisfy (5.1), (1) and (5.1), (2), and we have, therefore,

THEOREM 5.2. Every ring of finite class is solvable.

We prove now a result due to Jacobson [3] which we shall need later.


LEMMA 5.3. Let R be any associative ring and N a nilpotent subring of R which
is closed under commutation with elements of R, that is, if $x \in R$, $n \in N$, then
$x \circ n \in N$. Then $R \cdot N \cup N$ is a nilpotent ideal of R.

Proof. It is clear that $\bar{N} = RN \cup N$ is an ideal since $NR \subseteq RN \cup N$. Consider

$$\bar{N}^2 = (RN \cup N)(RN \cup N) \subseteq (RNRN) \cup (NRN) \cup (RN^2) \cup N^2.$$

Now since N is closed under commutation,

$$NRN \subseteq RN^2 \cup N^2$$

and hence

$$\bar{N}^2 \subseteq RN^2 \cup N^2.$$

Similarly

$$\bar{N}^k \subseteq RN^k \cup N^k$$

and hence, since $N^k = 0$ for some k, $\bar{N}^k = 0$, which proves that $\bar{N}$ is nilpotent,
as required.

We need also the following

LEMMA 5.4. Let C be a commutative ideal of R. Then $(R \circ C) \cdot C = 0$, and in
particular, $(R \circ C) \cdot (R \circ C) = 0$, that is, $(R \circ C)$ is nilpotent.

Proof. If C is commutative, then $C \circ C = 0$, and hence, if $c, c'$ are any two
elements of C, and $x \in R$, we have $xc \in C$ and therefore

$$(xc) \circ c' = 0.$$

However, $(xc) \circ c' = x(c \circ c') + (x \circ c')c$, and hence $(x \circ c')c = 0$, which shows
that $(R \circ C) \cdot C = 0$. Since $C \supseteq R \circ C$, we have $(R \circ C) \cdot (R \circ C) = 0$, that is,
$R \circ C$ is nilpotent.

We are now in a position to prove the principal result of this section, which is

THEOREM 5.5. If R is a solvable ring, its derived ring is nilpotent.

Proof. Suppose first that R is such that $R^{(2)} = 0$. Then $R^{(1)}$ is a commutative
ideal of R. We form $R \circ R^{(1)} = S$, say. If $S = 0$, then R is a ring of
class 2, and hence $R^{(1)}\ (= H_2(R))$ is nilpotent by (3.9). If $S \neq 0$, then by
(5.4) S is a nilpotent ideal of R. Moreover, R/S is of class 2, and hence
$R^{(1)}/S\ (= H_2(R/S))$ is nilpotent, and since both $R^{(1)}/S$ and S are nilpotent, so
is $R^{(1)}$, which proves our theorem for the case that $R^{(2)} = 0$.

Suppose now that we have

$$R = R^{(0)} \supset R^{(1)} \supset \cdots \supset R^{(k)} = 0$$

with $k > 2$. We proceed by induction and suppose (5.5) true for rings with
derived chain of length less than k. In particular, (5.5) is true for the ring $R^{(1)}$
whose derived chain is of length $k - 1$, and hence $R^{(2)}$ is a nilpotent ideal of $R^{(1)}$,
and hence is a nilpotent subring of R. We show first that $R^{(2)}$ is closed under
commutation with elements of R; $R^{(1)}$ is closed under commutation since it is an
ideal of R. Any element of $R^{(2)}$ may be written as a sum of elements of one or
both of the types

$$a_1 \circ b_1, \quad c_1(a_1 \circ b_1),$$

where $a_1$, $b_1$, $c_1$ are elements of $R^{(1)}$. Consider
$(a_1 \circ b_1) \circ x = a_1 \circ (b_1 \circ x) + b_1 \circ (x \circ a_1)$;
since $R^{(1)}$ is an ideal, $b_1 \circ x$ and $x \circ a_1$ are again in $R^{(1)}$, and hence
$(a_1 \circ b_1) \circ x$ is in $R^{(2)}$. Similarly, since

$$[c_1(a_1 \circ b_1)] \circ x = c_1[(a_1 \circ b_1) \circ x] + (c_1 \circ x)(a_1 \circ b_1)$$

and $(a_1 \circ b_1) \circ x \in R^{(2)}$ and $(c_1 \circ x) \in R^{(1)}$, $[c_1(a_1 \circ b_1)] \circ x$ is in $R^{(2)}$ and hence $R^{(2)}$
is closed under commutation.

We form the ideal

$$M = R^{(2)} \cup R R^{(2)}$$

of R. Since $R^{(2)}$ is nilpotent and closed under commutation, by (5.3) M is a
nilpotent ideal of R. Consider R/M; its second derived ring is zero, since
$R^{(2)} \subseteq M$, and hence $R^{(1)}/M$ is nilpotent by our first case. It follows that $R^{(1)}$
is nilpotent since $R^{(1)}/M$ and M are, which proves our theorem.
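A small computation illustrates Theorem 5.5 and shows that solvability is weaker than finite class. The sketch below is an illustration only, assuming the ring T of upper triangular 3 x 3 matrices over the prime field GF(5), with ideals represented by bases; the function names are hypothetical. Each derived ring is formed relative to the preceding one, as in (5A).

```python
from itertools import product

P = 5   # prime modulus: entries live in GF(5) (an assumption of this sketch)
N = 3   # matrix size

def unit(i, j):
    """Matrix unit E_ij (0-based), stored row-major as a flat tuple."""
    return tuple(int((r, c) == (i, j)) for r in range(N) for c in range(N))

def mul(x, y):
    return tuple(sum(x[i * N + k] * y[k * N + j] for k in range(N)) % P
                 for i in range(N) for j in range(N))

def comm(x, y):
    """Commutator x o y = xy - yx modulo P."""
    return tuple((a - b) % P for a, b in zip(mul(x, y), mul(y, x)))

def insert(v, basis):
    """Reduce v against the echelon basis; add it if non-zero."""
    v = list(v)
    for piv in sorted(basis):
        if v[piv]:
            f = v[piv]
            v = [(a - f * b) % P for a, b in zip(v, basis[piv])]
    for idx, val in enumerate(v):
        if val:
            inv = pow(val, P - 2, P)
            basis[idx] = tuple(a * inv % P for a in v)
            return True
    return False

def commutator_ideal(A, B, R):
    """Basis of the ideal of R generated by all a o b, a in A, b in B."""
    basis, queue = {}, [comm(a, b) for a, b in product(A, B)]
    while queue:
        v = queue.pop()
        if insert(v, basis):
            for r in R:
                queue += [mul(r, v), mul(v, r)]
    return list(basis.values())

def chain_dims(R, derived):
    """Dimensions along the derived chain (derived=True) or the lower
    central chain (derived=False), stopping at 0 or when the chain stalls."""
    dims, X = [len(R)], R
    while X:
        X = commutator_ideal(X, X, X) if derived else commutator_ideal(X, R, R)
        stalled = len(X) == dims[-1]
        dims.append(len(X))
        if stalled:
            break
    return dims

def cube_is_zero(D):
    """True when every triple product of basis elements of D vanishes."""
    return all(not any(mul(mul(a, b), c)) for a, b, c in product(D, repeat=3))

T = [unit(i, j) for i in range(N) for j in range(i, N)]
D1 = commutator_ideal(T, T, T)            # the derived ring of T
print(chain_dims(T, derived=True))        # [6, 3, 1, 0]
print(chain_dims(T, derived=False))       # [6, 3, 3]
print(cube_is_zero(D1))                   # True
```

In this example the derived chain reaches zero, so T is solvable, and its derived ring has vanishing cube, in agreement with the theorem. The lower central chain stalls at a non-zero ideal, so T is not of finite class.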

An equivalent statement to the above is

THEOREM 5.6. A necessary and sufficient condition that a ring R be solvable is
that R contain a nilpotent ideal N such that R/N is commutative. In particular,
if R has a radical, it is necessary and sufficient that R be commutative modulo its
radical.

Proof. The necessity of the condition follows from (5.5); to see that it is sufficient we
use (5.1). Suppose that N is a nilpotent ideal of R. We form the powers of N,
which are ideals of R, and consider the chain

(5B) $R \supset N \supset N^2 \supset \cdots \supset N^\lambda \supset 0$.

Now R/N is commutative, and since $N^i/N^{i+1}$ is commutative, $i = 1, 2, \cdots$,
by (5.1) R is solvable.

For some purposes it is inconvenient that the members of the derived chain
of a solvable ring are not in general ideals of the ring. We see that it is possible
to overcome this difficulty. For a given ring we form a chain of ideals:

(5C) $R = R_0 \supseteq R_1 \supseteq R_2 \supseteq \cdots \supseteq R_\rho \supseteq \cdots$,

where $R_1 = R \circ R$, and in general $R_\rho = R_{\rho-1} \circ R_{\rho-1}$, the commutator ideal
$R_{\rho-1} \circ R_{\rho-1}$ being formed in this case with respect to the whole ring R, and
hence $R_\rho$ being an ideal of R. Alternatively we may define $R_\rho$ as the intersection
of all ideals S of R contained in $R_{\rho-1}$ such that $R_{\rho-1}/S$ is commutative.
Now if R is solvable, (5B) is a chain of ideals of R such that $N^i \supseteq R_i$ and hence
$R_{\lambda+1} = 0$, and therefore the chain (5C) terminates with the zero ideal.
Conversely, if (5C) terminates with zero, by (5.1) R is solvable.

We embody these remarks in

THEOREM 5.7. A ring R is solvable if and only if there exist chains of ideals of R

$$R = S_0 \supset S_1 \supset S_2 \supset \cdots \supset S_n = 0$$

such that $S_{\rho-1}/S_\rho$ is commutative, $\rho = 1, 2, \cdots, n$.

The following condition for a ring to be solvable is obtained from (5.3).

THEOREM 5.8. Let C be the subring of R generated by all elements of the form
$x \circ y$, where $x, y \in R$. A necessary and sufficient condition that R be solvable is
that C be nilpotent.

Proof. The condition is necessary since $C \subseteq R^{(1)}$, and if R is solvable, $R^{(1)}$
is nilpotent. Conversely, suppose C is nilpotent; from its definition C is closed
under commutation with elements of R, and hence by (5.3), the ideal

$$M = C \cup RC$$

is nilpotent. Moreover, R/M is clearly commutative, and hence by (5.6) R is
solvable.


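6. The associated Lie ring. We call the Lie ring formed from a given associative
ring R by combining the elements of R under addition and commutation,
where the commutator $x \circ y$ of two elements $x, y \in R$ is defined as $xy - yx$, the
Lie ring associated with R.⁶ We shall denote this Lie ring by $\mathfrak{R}$; we may remark
that there is no confusion in supposing that R and $\mathfrak{R}$ have the same elements,
and we shall use $x \circ y$ to denote the expression $xy - yx$ in R as before. We are
concerned in this section with relations between the properties of R and $\mathfrak{R}$.
We recall that for any two ideals $\mathfrak{A}$, $\mathfrak{B}$ of an arbitrary Lie ring $\mathfrak{L}$ we may form
the product $[\mathfrak{A}, \mathfrak{B}]$ consisting of all elements of the form $a \circ b$, where $a \in \mathfrak{A}$,
$b \in \mathfrak{B}$. The product $[\mathfrak{A}, \mathfrak{B}]$ is again an ideal of $\mathfrak{L}$. A Lie ring $\mathfrak{L}$ is said to be
nilpotent if the chain

$$\mathfrak{L} = \mathfrak{L}_1 \supseteq \mathfrak{L}_2 \supseteq \mathfrak{L}_3 \supseteq \cdots \supseteq \mathfrak{L}_c \supseteq \mathfrak{L}_{c+1} = 0$$

terminates with zero, where $\mathfrak{L}_\rho = [\mathfrak{L}_{\rho-1}, \mathfrak{L}]$. A Lie ring $\mathfrak{L}$ is solvable if the chain

$$\mathfrak{L} = \mathfrak{L}^{(0)} \supseteq \mathfrak{L}^{(1)} \supseteq \cdots \supseteq \mathfrak{L}^{(k)} = 0,$$

where $\mathfrak{L}^{(\rho)} = [\mathfrak{L}^{(\rho-1)}, \mathfrak{L}^{(\rho-1)}]$, terminates with zero.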
Suppose now that R is a solvable ring, with derived chain

$$R = R^{(0)} \supset R^{(1)} \supset \cdots \supset R^{(k)} = 0.$$

Then if $a_i$ is any element of $\mathfrak{R}^{(i)}$, it is clear that $a_i \in R^{(i)}$, and hence $\mathfrak{R}^{(k)} = 0$,
that is, $\mathfrak{R}$ is a solvable Lie ring. Similarly if R is of finite class and has the
lower central chain

$$R = H_1 \supset H_2 \supset \cdots \supset H_c \supset H_{c+1} = 0,$$

then if $b_i$ is any element of $\mathfrak{R}_i$, $b_i \in H_i$, and hence $\mathfrak{R}_{c+1} = 0$, that is, $\mathfrak{R}$ is
nilpotent, and its class is at most equal to the class of R.
We may state, therefore,

THEOREM 6.1. If R is a solvable ring with $R^{(k)} = 0$, then the associated Lie
ring $\mathfrak{R}$ is solvable, and $\mathfrak{R}^{(k)} = 0$.

THEOREM 6.2. If R is of finite class c, then the associated Lie ring $\mathfrak{R}$ is
nilpotent and has class at most c.

⁶ For a brief discussion of some of the properties of Lie algebras and Lie rings, and of
the Lie ring associated with a given associative ring, cf. [2], [3], [4].










The converse problem to the above, namely, knowing that $\mathfrak{R}$ is solvable, or
nilpotent, to draw conclusions regarding R, is more difficult. For a general
ring, our only result along this line is embodied in (6.5), but for the special
case when R is an algebra, we prove that, provided the characteristic of the
underlying field is not two, the solvability of the associated Lie algebra implies
the corresponding property of the original associative algebra, while if the
associated Lie algebra is nilpotent, the original algebra is of finite class,
whatever the characteristic.

We proceed to prove, first,⁷

THEOREM 6.3. Let R be an algebra over a field $\Phi$. If the characteristic of $\Phi$
is not two, then R is solvable if $\mathfrak{R}$ is solvable. There are non-solvable algebras over
any field of characteristic two whose associated Lie algebras are solvable.

Proof. Let N be the radical of R, and set $S = R/N$. Because of (5.6), it will
suffice to show that S is commutative. In what follows, if A is any algebra we
denote its associated Lie algebra by $A_l$. Now if $R_l$ is a solvable Lie algebra,
so is $S_l$. Moreover, S is semi-simple and if

$$S = A_1 \oplus A_2 \oplus A_3 \oplus \cdots \oplus A_t$$

is the splitting of S into simple components, then

$$S_l = (A_1)_l \oplus (A_2)_l \oplus (A_3)_l \oplus \cdots \oplus (A_t)_l$$

and $(A_i)_l$ is solvable, $i = 1, 2, \cdots, t$. We may therefore reduce our problem
to the case of a simple algebra A, such that $A_l$ is a solvable Lie algebra, and we
wish to prove that A is commutative. Now let $\Sigma$ be the center of A, and
consider the algebra (A over $\Sigma$). Clearly $A_l$ is solvable if and only if (A over $\Sigma$)$_l$
is, and hence we may further reduce the problem to the case where A is a normal
simple algebra over the field $\Sigma$. Let $\Omega$ be the algebraic closure of $\Sigma$ and
consider (A over $\Omega$); we know that

$$(A \text{ over } \Omega) = \Omega_n,$$

where $\Omega_n$ is a complete matric algebra of degree n over $\Omega$, where n is the rank
of A over $\Sigma$. Now (A over $\Omega$)$_l$ is solvable, since (A over $\Sigma$)$_l$ is and hence $(\Omega_n)_l$
must be solvable. The structure of $(\Omega_n)_l$ is known, viz. [2], pp. 216-217. If
the characteristic of $\Omega$ (and hence of $\Phi$) is not two, $(\Omega_n)_l$ is solvable if and only
if $n = 1$, and if the characteristic is two, if and only if $n = 1$ or 2. We see,
therefore, that a simple algebra has a solvable associated Lie algebra if and only
if it is of rank one over its center, that is, if it is commutative, provided the
characteristic of the field $\Phi$ is not two, which proves the first part of our theorem.

Direct calculation shows that the complete matric algebra $\Phi_2$ of degree 2 over
any field $\Phi$ of characteristic 2 is not solvable, and indeed is such that
$(\Phi_2) \circ (\Phi_2) = \Phi_2$. However, $(\Phi_2)_l$ is solvable. It is readily verified that $(\Phi_2)_l$, while solvable,
is not a nilpotent Lie algebra and hence in the above, if the condition that the
Lie algebra $(A)_l$ be solvable is replaced by the stronger condition that it be
nilpotent, then we may conclude that $n = 1$ even in the case of characteristic two,
and hence A coincides with its center.

⁷ I am indebted to the referee for the simple proof of Theorem 6.3 which follows, and
particularly for pointing out to me the exceptional case which arises when the characteristic
of the underlying field is 2.
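The "direct calculation" for the exceptional case can be carried out mechanically in the smallest instance. The sketch below is an illustration under the assumption $\Phi = GF(2)$; it computes the derived and lower central series of the Lie algebra of 2 x 2 matrices, and the associative commutator ideal. The function names are hypothetical.

```python
from itertools import product

P = 2   # the field GF(2), the smallest field of characteristic two
N = 2   # 2 x 2 matrices

def unit(i, j):
    """Matrix unit E_ij (0-based), stored row-major as a flat tuple."""
    return tuple(int((r, c) == (i, j)) for r in range(N) for c in range(N))

def mul(x, y):
    return tuple(sum(x[i * N + k] * y[k * N + j] for k in range(N)) % P
                 for i in range(N) for j in range(N))

def comm(x, y):
    """Commutator x o y = xy - yx modulo P."""
    return tuple((a - b) % P for a, b in zip(mul(x, y), mul(y, x)))

def insert(v, basis):
    """Reduce v against the echelon basis; add it if non-zero."""
    v = list(v)
    for piv in sorted(basis):
        if v[piv]:
            f = v[piv]
            v = [(a - f * b) % P for a, b in zip(v, basis[piv])]
    for idx, val in enumerate(v):
        if val:
            inv = pow(val, P - 2, P)
            basis[idx] = tuple(a * inv % P for a in v)
            return True
    return False

def bracket_span(A, B):
    """Basis of the Lie product [A, B]: the span of all a o b."""
    basis = {}
    for a, b in product(A, B):
        insert(comm(a, b), basis)
    return list(basis.values())

def lie_dims(L, derived):
    """Dimensions along the derived series (derived=True) or the lower
    central series (derived=False) of the Lie algebra spanned by L."""
    dims, X = [len(L)], L
    while X:
        X = bracket_span(X, X if derived else L)
        stalled = len(X) == dims[-1]
        dims.append(len(X))
        if stalled:
            break
    return dims

def commutator_ideal(A, B, R):
    """Basis of the associative ideal of R generated by all a o b."""
    basis, queue = {}, [comm(a, b) for a, b in product(A, B)]
    while queue:
        v = queue.pop()
        if insert(v, basis):
            for r in R:
                queue += [mul(r, v), mul(v, r)]
    return list(basis.values())

gl2 = [unit(i, j) for i in range(N) for j in range(N)]
print(lie_dims(gl2, derived=True))            # [4, 3, 1, 0]
print(lie_dims(gl2, derived=False))           # [4, 3, 3]
print(len(commutator_ideal(gl2, gl2, gl2)))   # 4
```

The derived series of the Lie algebra reaches zero, so it is solvable, while its lower central series stalls at dimension 3, so it is not nilpotent. The commutator ideal is the whole algebra, so the associative derived chain never descends.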

We have proved, therefore, 

THEOREM 6.4. Any semi-simple algebra whose associated Lie algebra is nil- 
potent is commutative. 

In general, little seems to be known of the conditions which must be imposed
on an associative ring R in addition to the condition that $\mathfrak{R}$ be solvable (or
nilpotent) in order to ensure that R enjoy the corresponding property. We
prove here that if R is already known to be solvable, then the nilpotency of $\mathfrak{R}$
implies that R is of finite class.

THEOREM 6.5. A solvable ring R is of finite class if and only if its associated
Lie ring $\mathfrak{R}$ is nilpotent.


Proof. We have seen that if R has finite class, so has $\mathfrak{R}$. Conversely,
suppose that $\mathfrak{R}$ has class $\gamma$; if $\gamma = 1$, R is commutative, and hence has class 1,
and our theorem is true. Suppose, therefore, that our theorem holds if $\mathfrak{R}$ has
class at most $\gamma - 1$. Let C be the subring of R generated by those elements
of R which belong to $\mathfrak{R}_\gamma$; then since all such elements belong to the center of R,
$c \circ x = 0$, where $c \in C$ and $x \in R$. Hence C is closed under commutation, and
since $C \subseteq R^{(1)}$, and $R^{(1)}$ is nilpotent by (5.6), C is nilpotent, and hence, by (5.3)

$$M = C \cup RC$$

is a nilpotent ideal of R. Consider R/M; it is a solvable ring, and the class of
its associated Lie ring is at most $\gamma - 1$; hence by our induction R/M is of finite
class, and therefore for some integer $\rho$, $H_\rho(R) \subseteq M$. Our theorem will be
proved if we show that, for some integer $\sigma$,

$$P_\sigma = (\cdots((M \circ R) \circ R) \circ \cdots) = 0,$$

where the above expression is obtained by commuting M with R $\sigma$ times, for
then $H_{\rho+\sigma}(R) \subseteq P_\sigma = 0$.

We consider, therefore, $M \circ R$. Let $m'$ be any element of M and $x \in R$; if
$m' \in C$, then $m' \circ x = 0$. If $m' \in RC$, $m'$ may be written as a sum of elements
of the form $m = yc$, where $y \in R$, $c \in C$, and we have

$$m \circ x = (yc \circ x) = (y \circ x)c$$

since c is in the center of R. It follows that $M \circ R \subseteq H_2 \cdot C$, where $H_2 = H_2(R)$.
If $M \circ R \neq 0$, consider $(M \circ R) \circ R$; we see similarly that $(M \circ R) \circ R \subseteq H_3 \cdot C$
and in general

$$P_\sigma \subseteq H_{\sigma+1} \cdot C.$$

Now we know that $H_\rho \subseteq M$, and hence

$$P_{\rho-1} \subseteq M \cdot C \subseteq RC^2 \cup C^2.$$

Similarly $P_{2\rho-2} \subseteq H_\rho \cdot C^2$, and if we continue in this way we see, since C is nilpotent,
that there exists an integer $\sigma$ such that $P_\sigma = 0$, and our theorem is proved.

We are now in a position to prove that every algebra whose associated Lie
ring is nilpotent is of finite class. Let R be an algebra such that $\mathfrak{R}$ is nilpotent.
Then by (6.4) R/N is commutative, where N is the radical of R, and hence R
is solvable, by (5.6). Hence we see, from (6.5), that R is of finite class. We
have proved, therefore,

THEOREM 6.6. Let R be an algebra, and $\mathfrak{R}$ its associated Lie algebra. Then R
is of finite class if and only if $\mathfrak{R}$ is nilpotent.


In the case of algebras, therefore, our concept of finite class is equivalent to 
the nilpotency of the associated Lie algebra. 


7. Solvable nil-rings. A ring R is said to be a nil-ring if for any $x \in R$ there
exists an integer $\rho$ (which may depend on x) such that $x^\rho = 0$. Any nilpotent
ring is also a nil-ring, but examples are known of nil-rings which are not nilpotent.

We need the following lemma. 


LEMMA 7.1. A commutative nil-ring generated by a finite number of elements is
nilpotent.

Proof. If R is commutative, and is generated by a finite number of elements,
say $a_1, a_2, \ldots, a_r$, then every element of R may be written as a sum of elements
of the form

$$a_1^{\alpha_1} a_2^{\alpha_2} \cdots a_r^{\alpha_r} \qquad (\alpha_i \ge 0;\ i = 1, 2, \ldots, r).$$

If R is a nil-ring, there exist positive integers $\beta_1, \beta_2, \ldots, \beta_r$ so that

$$a_i^{\beta_i} = 0 \qquad (i = 1, 2, \ldots, r).$$

It follows at once that if $\sigma = \sum \beta_i$, then $R^\sigma = 0$, and R is nilpotent.
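The counting step behind the last sentence can be checked mechanically. The following is only an illustrative sketch (the function name is hypothetical, not from the paper): it enumerates every exponent vector of total degree $\sigma = \sum \beta_i$ and confirms that some exponent $\alpha_i$ reaches $\beta_i$, so that the corresponding monomial contains the factor $a_i^{\beta_i} = 0$.

```python
from itertools import product

def every_monomial_vanishes(betas):
    """Check that each monomial of total degree sigma = sum(betas) in
    len(betas) commuting generators has some exponent alpha_i >= beta_i."""
    sigma = sum(betas)
    for alphas in product(range(sigma + 1), repeat=len(betas)):
        if sum(alphas) != sigma:
            continue  # only monomials of total degree exactly sigma
        if not any(alpha >= beta for alpha, beta in zip(alphas, betas)):
            return False  # a monomial that the nil conditions would not kill
    return True

print(every_monomial_vanishes([2, 3, 4]))  # True
```

If every exponent satisfied $\alpha_i \le \beta_i - 1$, the total degree would be at most $\sigma - r < \sigma$, which is why the check always succeeds; the same argument covers monomials of degree greater than $\sigma$.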

In connection with (7.1) we may remark that some condition similar to that 
in the lemma is essential, since it is easy to construct a nil-ring with an infinite 
generating set, which is commutative but not nilpotent. 

The following theorem may now be proved. 


THEOREM 7.2. Let R be a nil-ring generated by a finite number of elements. A
necessary and sufficient condition that R be nilpotent is that R be solvable.


Proof. If R is solvable, $R'$ is a nilpotent ideal of R and $R/R'$ is commutative,
by (5.5). The factor ring $R/R'$, however, is a nil-ring generated by a
finite number of elements (since R is) and hence by (7.1) is nilpotent. Since
$R/R'$ and $R'$ are nilpotent, so is R.










CENTRAL CHAINS OF IDEALS IN AN ASSOCIATIVE RING 355 


BIBLIOGRAPHY

1. P. Hall, A contribution to the theory of groups of prime-power orders, Proceedings of the
London Mathematical Society, (2), vol. 36(1933-1934), pp. 29-95.

2. N. Jacobson, Abstract derivation and Lie algebras, Transactions of the American
Mathematical Society, vol. 42(1937), pp. 206-224.

3. N. Jacobson, Rational methods in the theory of Lie algebras, Annals of Mathematics, vol.
36(1935), pp. 875-881.

4. N. Jacobson, Restricted Lie algebras of characteristic p, Transactions of the American
Mathematical Society, vol. 50(1941), pp. 15-25.

5. W. Magnus, Über Gruppen und zugeordnete Liesche Ringe, Journal für die reine und
angewandte Mathematik, vol. 182(1940), pp. 142-149.

6. H. Zassenhaus, Endliche p-Gruppe und Lie-Ring mit der Characteristik p, Abhandlungen
aus dem mathematischen Seminar der Hansischen Universität, vol. 13(1939),
pp. 200-206.

7. H. Zassenhaus, Lehrbuch der Gruppentheorie, I, Leipzig and Berlin, 1937.


THE UNIVERSITY OF BRITISH COLUMBIA.





GENERALIZED “SANDWICH” THEOREMS 


By A. H. STONE AND J. W. TUKEY


The following theorem is well known under the self-explanatory name of
the "ham sandwich theorem".¹


Given any three sets in space, each of finite outer Lebesgue measure (m*), there 
exists a plane which bisects all three sets, in the sense that the part of each set 
which lies on one side of the plane has the same outer measure as the part 
of the same set which lies on the other side of the plane. 

The usual proof is based on the following theorem of Borsuk.²


If $\phi$ is a continuous mapping of the n-sphere $S^n$ in Euclidean n-space $R^n$ which
is "antipodal" (i.e., diametrically opposite points of $S^n$ map into points symmetric
about the origin in $R^n$), then there is a point of $S^n$ which maps into the origin of $R^n$.


If now p denotes a plane in $R^3$, let $p^+$ and $p^-$ denote the two parts into which
p divides $R^3$, and let v be the unit-vector perpendicular to p, oriented from $p^-$
to $p^+$. Let $A_i$ (i = 1, 2, 3) be the given sets. The usual argument proves
first, from measure-theoretic considerations, that for each v a corresponding p
can be found, depending continuously on v, which bisects $A_1$. The correspondence

$$\phi(v) = [\,m^*(p^+ \cdot A_2) - m^*(p^- \cdot A_2),\ m^*(p^+ \cdot A_3) - m^*(p^- \cdot A_3)\,]$$

is then an antipodal mapping of $S^2$ in $R^2$. The result now follows from the case n = 2
of the Borsuk theorem (which can, for n = 2, be proved readily ab initio).


Now, a fuller use of the Borsuk theorem gives an easier proof of a more
general theorem. Let R be any point-set on which a Carathéodory outer
measure m* is defined. Let f be a real-valued function defined over $S^n \times R$
such that:

(1) For each $\lambda \in S^n$, $f(\lambda, x)$ is a measurable function over R ($x \in R$), and vanishes
only over a set of measure zero.

(2) For each $x \in R$, $f(\lambda, x)$ is a continuous function over $S^n$.

(3) For each pair of diametrically opposite points $\lambda$ and $-\lambda$ of $S^n$,
$f(\lambda, x) \cdot f(-\lambda, x) \le 0$ almost everywhere in R.

Write $f^{+}(\lambda)$, $f^{0}(\lambda)$, and $f^{-}(\lambda)$ respectively for the subsets of R on which $f(\lambda, x) > 0$,
$= 0$, and $< 0$. We say "$f^{0}(\lambda)$ bisects $A \subset R$" if $m^*(f^{+}(\lambda) \cdot A) = m^*(f^{-}(\lambda) \cdot A)$.


THEOREM. Given any n sets $A_1, A_2, \ldots, A_n$ in R, each of finite outer measure,
there exists $\lambda \in S^n$ such that $f^{0}(\lambda)$ bisects each $A_i$ (i = 1, 2, ..., n).


Proof. Define a mapping $\phi$ of $S^n$ in $R^n$ by:
(4) The i-th coordinate of $\phi(\lambda)$ is $m^*(f^{+}(\lambda) \cdot A_i) - m^*(f^{-}(\lambda) \cdot A_i)$.


Received November 12, 1941. 

1 Discovered by S. Ulam; we are indebted to the referee for calling this fact to 
our attention. 

2 Equivalent to Satz II of Drei Sätze über die n-dimensionale euklidische Sphäre, Fundamenta
Mathematicae, vol. 20(1933), p. 177. This theorem was suggested by Ulam.




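Clearly $\phi$ is antipodal; for, in virtue of (3) and (1), $f^{+}(\lambda) = f^{-}(-\lambda)$ to within
sets of measure zero.

Also $\phi$ is continuous. For let $\{\lambda_\nu\} \to \lambda_0$ in $S^n$. From (2), $f^{+}(\lambda_0) \subset \liminf
f^{+}(\lambda_\nu)$. Thus, since the sets $f^{+}(\lambda_\nu)$ are measurable and $A_i$ has finite outer
measure, we have $m^*(f^{+}(\lambda_0) \cdot A_i) \le \liminf m^*(f^{+}(\lambda_\nu) \cdot A_i)$. On the other hand,
$\limsup f^{+}(\lambda_\nu) \subset f^{+}(\lambda_0) + f^{0}(\lambda_0)$, from (2). Hence $\limsup m^*(f^{+}(\lambda_\nu) \cdot A_i) \le
m^*(f^{+}(\lambda_0) \cdot A_i) + 0$, using (1). Whence $m^*(f^{+}(\lambda_\nu) \cdot A_i) \to m^*(f^{+}(\lambda_0) \cdot A_i)$ as
$\nu \to \infty$. A similar argument applies to $f^{-}$. This establishes the continuity of $\phi$.

Hence Borsuk's theorem yields the existence of $\lambda \in S^n$ such that $\phi(\lambda) =
(0, 0, \ldots, 0)$; that is, such that $f^{0}(\lambda)$ bisects each $A_i$.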


COROLLARY. Given n + 1 measurable functions $f_0, \ldots, f_n$ over R, and n sets
$A_1, \ldots, A_n$ in R, of finite outer measure, then, provided that $f_0, \ldots, f_n$ are linearly
independent modulo sets of measure zero (i.e., that whenever $\lambda_0 f_0(x) + \cdots +
\lambda_n f_n(x) = 0$ over a subset of R of positive measure, then $\lambda_0 = \lambda_1 = \cdots = \lambda_n = 0$), there
will exist real numbers $\lambda_0, \ldots, \lambda_n$, not all zero, such that each $A_i$ is bisected by the
sets defined by $\lambda_0 f_0(x) + \cdots + \lambda_n f_n(x) < 0$ and $\lambda_0 f_0(x) + \cdots + \lambda_n f_n(x) > 0$.


For we can take $S^n$ as the unit sphere in $R^{n+1}$, and then put $f(\lambda, x) = \lambda_0 f_0(x) +
\lambda_1 f_1(x) + \cdots + \lambda_n f_n(x)$, where $\lambda$ is $(\lambda_0, \lambda_1, \ldots, \lambda_n)$.

This corollary plainly includes the "ham sandwich theorem"; we need only
take $f_0 = 1$, and $f_1 = x$, $f_2 = y$, $f_3 = z$, where x, y, z are the coordinates in $R^3 = R$.
It also includes such results as the following. Any n + 1 sets in $R^n$, of finite
outer measure, can be bisected by an (n - 1)-sphere, in the sense that the part
of each set which lies inside the sphere has the same outer measure as the part
which lies outside the sphere. (A plane is here regarded as a sphere of infinite
radius.) For we can take $f_0 = x_1^2 + x_2^2 + \cdots + x_n^2$, $f_i = x_i$ (i = 1, 2, ..., n),
and $f_{n+1} = 1$, where $x_1, x_2, \ldots, x_n$ are the coordinates in $R^n = R$.

Similarly, any five sets in the plane, of finite outer measure, can be bisected
by a conic; and so on.

Thus, roughly speaking, to require that a given subset of $R^n$ be bisected by
one of a family of algebraic manifolds is to impose a linear condition. This is
not true if bisection is replaced by division into other given ratios. In fact:


If $\alpha_1$, $\alpha_2$ are numbers such that, given any two sets $A_1$, $A_2$ in $R^3$, of finite measure,
there exists a plane which divides $A_i$ in the ratio $\alpha_i : 1 - \alpha_i$ (i = 1, 2); then $\alpha_1 =
\alpha_2 = \tfrac12$.


For we can obviously suppose that $0 \le \alpha_1 \le \alpha_2 \le 1$; and it is easily seen that
$\alpha_1 \ne 0$ and $\alpha_2 \ne 1$. Taking $A_1 = A_2$, we see that $\alpha_1 = \alpha_2$. Take $A_1$ to be a
small "solid" sphere and $A_2$ to be a large concentric one; then any plane which
divides $A_1$ in the right ratio must meet $A_1$ and so will approximately bisect $A_2$.
Thus $\alpha_1 = \alpha_2 = \tfrac12$.

The extension to two sets in $R^n$ is immediate.

In $R^1$, however, there is a remarkable analogue.


A necessary and sufficient condition that the real numbers $\alpha_1$, $\alpha_2$ be such that,
given any two sets $A_1$, $A_2$ in $R^1$, of finite outer measure, there exists an interval in $R^1$
whose intersection with $A_i$ has for outer measure $\alpha_i \cdot m^*(A_i)$ (i = 1, 2) is that $\alpha_1 =
\alpha_2 =$ the reciprocal of an integer greater than 1.³


(An "interval" here is either finite or half-infinite, i.e., is of the form (a, b)
or $(-\infty, b)$ or $(a, \infty)$.)

Proof of necessity. It is easily seen, as above, that $0 < \alpha_1 = \alpha_2 < 1$. Let
$\alpha_1 = \alpha_2 = \alpha$, say; then if $\alpha$ has not the form 1/n, where n is an integer greater
than 1, we can write


(5) $1/(n + 1) < \alpha < 1/n$, where n is a positive integer.


Let $0 < p < \tfrac12$, and denote by $E_k$ the interval $(k - p, k + p)$. Take $A_1 =
E_1 + E_3 + \cdots + E_{2n+1}$, and $A_2 = E_2 + E_4 + \cdots + E_{2n}$. Any interval I
which satisfies $m(I \cdot A_1) = \alpha \cdot m(A_1)$ must (since $\alpha > 1/(n + 1)$) meet two consecutive
E's of odd suffix. Hence I contains an E of even suffix; so $m(I \cdot A_2) \ge
(1/n) \cdot m(A_2) > \alpha \cdot m(A_2)$.
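The counterexample can be explored numerically. The sketch below is an illustration only; the function names, the grid of left end-points, and the sample values n = 3, p = 0.25, $\alpha$ = 0.3 are assumptions made here, not part of the paper. For each left end-point it takes the shortest interval capturing $\alpha \cdot m(A_1)$ of $A_1$ and records the share of $A_2$ that this interval is forced to contain.

```python
import math

def measure(a, b, intervals):
    """Lebesgue measure of (a, b) intersected with a disjoint union of intervals."""
    return sum(max(0.0, min(b, hi) - max(a, lo)) for lo, hi in intervals)

def smallest_right_end(a, target, intervals):
    """Smallest b with measure(a, b, intervals) == target, or None if unreachable."""
    got = 0.0
    for lo, hi in sorted(intervals):
        lo = max(lo, a)
        if hi <= lo:
            continue                      # this piece lies entirely left of a
        if got + (hi - lo) >= target:
            return lo + (target - got)
        got += hi - lo
    return None

def min_share_of_A2(n, p, alpha, step=0.01):
    """Least fraction of A_2 inside any sampled interval holding alpha of A_1."""
    E = lambda k: (k - p, k + p)
    A1 = [E(k) for k in range(1, 2 * n + 2, 2)]   # E_1, E_3, ..., E_{2n+1}
    A2 = [E(k) for k in range(2, 2 * n + 1, 2)]   # E_2, E_4, ..., E_{2n}
    target = alpha * measure(-math.inf, math.inf, A1)
    total2 = measure(-math.inf, math.inf, A2)
    lefts = [-math.inf] + [i * step for i in range(int((2 * n + 2) / step))]
    shares = []
    for a in lefts:
        b = smallest_right_end(a, target, A1)
        if b is not None:                 # skip left ends with too little of A_1 ahead
            shares.append(measure(a, b, A2) / total2)
    return min(shares)

print(min_share_of_A2(3, 0.25, 0.3))
```

Taking the smallest admissible right end-point for each left end-point minimizes the share of $A_2$, so the printed value is a sampled lower bound for $m(I \cdot A_2)/m(A_2)$; in this example it stays at 1/3, above $\alpha$ = 0.3, in agreement with the argument.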

Proof of sufficiency. Let $\alpha_1 = \alpha_2 = 1/n$, where n is an integer greater than 1.
We can take points $x_1 < x_2 < \cdots < x_{n-1} \in R^1$ such that, if $I_i$ denotes the interval
$(x_i, x_{i+1})$ (i = 0, 1, ..., n - 1; $x_0$ is written for $-\infty$, and $x_n$ for $+\infty$), then
$m^*(I_0 \cdot A_1) = m^*(I_1 \cdot A_1) = \cdots = m^*(I_{n-1} \cdot A_1) = m^*(A_1)/n$. Now, we have
$m^*(I_0 \cdot A_2) + m^*(I_1 \cdot A_2) + \cdots + m^*(I_{n-1} \cdot A_2) = m^*(A_2)$. Hence either
there is an i for which $m^*(I_i \cdot A_2) = m^*(A_2)/n$ (in which case $I_i$ is the required
interval) or there are i, j such that $m^*(I_i \cdot A_2) < m^*(A_2)/n < m^*(I_j \cdot A_2)$.
So, by an easy continuity argument, there will be an interval J (with its left-hand
end-point between $x_i$ and $x_j$ and right-hand end-point between $x_{i+1}$ and
$x_{j+1}$) for which $m^*(J \cdot A_1) = m^*(A_1)/n$ and $m^*(J \cdot A_2) = m^*(A_2)/n$; and J is
the required interval.

Remark. If we interpret "interval" to mean "finite interval", it is easily
seen that the corresponding necessary and sufficient condition on $\alpha_1$, $\alpha_2$ is:
$\alpha_1 = \alpha_2 =$ the reciprocal of an integer greater than 2.

The preceding result has an analogue in $R^2$. For let $(1/n) > \alpha > 1/(n + 1)$,
and consider the following two sets: $A_2$ = the "annulus" between two concentric
similarly situated regular n-gons of sides 1 and $1 + \delta$ (where $\delta$ is small
and positive) and $A_1$ = a set of n + 1 equal small circles, n of which are inscribed
in $A_2$ at its corners, and the last of which is concentric with $A_2$. It will
readily be verified that any circle⁴ which cuts off $\alpha$ times the measure of $A_1$
from $A_1$ must cut off nearly 1/n (at least) of $A_2$ in measure if $\delta$ is small. Consequently,
a necessary condition that the real numbers $\alpha_1$, $\alpha_2$ be such that, given
any two sets $A_1$, $A_2$ in $R^2$, of finite measure, there exists a circle which intersects $A_i$
in a set of measure $\alpha_i \cdot m(A_i)$ (i = 1, 2) is that $\alpha_1 = \alpha_2 =$ the reciprocal of an
integer greater than 1.

It is plausible that this condition is also sufficient, but we are unable to
prove this. In the case $\alpha_1 = \alpha_2 = \tfrac12$, the sufficiency follows from the "ham
sandwich theorem".


3 This can also be deduced from a theorem of P. Lévy, Généralisation du théorème de Rolle,
C. R. Acad. Sci., vol. 198(1934), pp. 424-425. See also H. Hopf, Über die Sehnen ebener
Kontinuen und die Schleifen geschlossener Wege, Commentarii Mathematici Helvetici,
vol. 9(1937), pp. 303-319.

4 A half-plane (determined by a straight line) is regarded as a circle.


THE INSTITUTE FOR ADVANCED STUDY AND PRINCETON UNIVERSITY.








THE CONTINUED FRACTION AS A SEQUENCE OF LINEAR 
TRANSFORMATIONS 


By J. FINDLAY PAYDON AND H. S. WALL


1. Introduction. This paper contains a development of properties of the
continued fraction

$$(1.1)\qquad \cfrac{1}{1 + \cfrac{a_2}{1 + \cfrac{a_3}{1 + \cfrac{a_4}{1 + \cdots}}}}$$

in which the elements $a_2, a_3, a_4, \ldots$ are complex numbers. The central idea
may be described as follows. With the continued fraction (1.1) we associate
a sequence of linear transformations¹

$$(1.2)\qquad a_1(v) = v, \qquad a_k(v) = \frac{1}{1 + a_k v} \quad (k = 2, 3, 4, \ldots).$$

Then the product of the first n of these is

$$(1.3)\qquad a_1 a_2 \cdots a_n(v) = \cfrac{1}{1 + \cfrac{a_2}{1 + \cdots + \cfrac{a_{n-1}}{1 + a_n v}}} = \frac{A_{n-2}\, a_n v + A_{n-1}}{B_{n-2}\, a_n v + B_{n-1}},$$

where $A_k$ and $B_k$ are the k-th numerator and denominator of (1.1), i.e., $A_{-1} = 1$,
$B_{-1} = 0$, $A_0 = 0$, $B_0 = 1$, $A_k = A_{k-1} + a_k A_{k-2}$, $B_k = B_{k-1} + a_k B_{k-2}$ (k = 1,
2, 3, ...; $a_1 = 1$). Corresponding to an arbitrary set V of points in the
complex z-plane, called a value region, we determine a set E of points, called
the element region (corresponding to V), by the condition that a is in E if and
only if the transformation $w = a(v) = 1/(1 + av)$ transforms V into a subset of
itself, i.e., $a(V) \subset V$. It is at once evident that if $a_2, a_3, a_4, \ldots$ are in E, and

$$a_1 a_2 \cdots a_n(V) = V^{(n)} \qquad (n = 1, 2, 3, \ldots),$$

then

$$V \supseteq V^{(1)} \supseteq V^{(2)} \supseteq V^{(3)} \supseteq \cdots.$$

Hence, if V is a bounded closed set so that $V^{(1)}, V^{(2)}, V^{(3)}, \ldots$ are closed, then
there are just two cases, namely:

Case I. The sets $V^{(n)}$ (n = 1, 2, 3, ...) have one and only one point, $v_0$,
in common.

Case II. The sets $V^{(n)}$ (n = 1, 2, 3, ...) have two or more points in common.

In Case I, we have, uniformly for all v in V:

$$\lim_{n \to \infty} a_1 a_2 \cdots a_n(v) = v_0.$$

Received November 12, 1941. Part of this paper was presented to the American Mathematical
Society by J. Findlay Paydon, September 2, 1941, under the title Convergence
regions and value regions for continued fractions, and part by J. Findlay Paydon and H. S.
Wall, December 29, 1941, under the title An extension of the Stieltjes continued fraction theory.

1 It is convenient to use the symbol $a_k$ in two senses. The subscript 1 will be reserved
for the identity transformation.




Hence it follows that if V contains 0 or 1, the continued fraction converges
to $v_0$ inasmuch as $A_{n-1}/B_{n-1} = a_1 a_2 \cdots a_n(0)$ and $A_n/B_n = a_1 a_2 \cdots a_n(1)$. In
Case II the continued fraction is in general divergent, and in Case I the continued
fraction may diverge if V does not contain 0 or 1.²
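The two descriptions of the approximants, by the three-term recurrence for $A_k$, $B_k$ and by composing the transformations $a_k(v) = 1/(1 + a_k v)$, can be compared numerically. The following is a minimal sketch; the function names and the sample elements are assumptions for illustration only.

```python
def approximant_by_recurrence(a, n):
    """Return A_n / B_n from A_k = A_{k-1} + a_k A_{k-2} and B_k = B_{k-1} + a_k B_{k-2}.

    `a` maps an index k >= 2 to the element a_k; a_1 = 1 by convention.
    """
    A_prev, B_prev = 1, 0      # A_{-1}, B_{-1}
    A_cur, B_cur = 0, 1        # A_0, B_0
    for k in range(1, n + 1):
        a_k = 1 if k == 1 else a(k)
        A_prev, A_cur = A_cur, A_cur + a_k * A_prev
        B_prev, B_cur = B_cur, B_cur + a_k * B_prev
    return A_cur / B_cur

def approximant_by_composition(a, n, v):
    """Return a_1 a_2 ... a_n(v): a_n is applied first, a_2 last; a_1 is the identity."""
    for k in range(n, 1, -1):
        v = 1 / (1 + a(k) * v)
    return v

elements = lambda k: complex(0.1 * k, -0.05)   # arbitrary sample elements a_k
for n in range(1, 6):
    print(n, approximant_by_recurrence(elements, n),
          approximant_by_composition(elements, n, 1))
```

Evaluating the composition at v = 1 reproduces $A_n/B_n$, and at v = 0 it reproduces $A_{n-1}/B_{n-1}$, which is the observation used in the text.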

We take for value region $V = V_c$ the circle $|z - c| = |c|$ and its interior,
where $\Re(c) > \tfrac12$, and determine the corresponding element region $E = E_c$ to
be a parabola and its interior, the parabola having its focus at the origin and
not having the point $-\tfrac14$ on the interior. If a is an arbitrary point not in the
interval $[\Re(a) \le -\tfrac14,\ \Im(a) = 0]$, then c can be found such that $\Re(c) > \tfrac12$, and
such that a lies within $E_c$. We then show in particular that (1.1) converges if
$a_2, a_3, a_4, \ldots$ lie in any bounded closed region within $E_c$.

In §2 we determine the element region $E_c$ corresponding to the value region
$V_c$ defined above. In §3 we consider in detail the case where c is real, and
arrive at new proofs of the "parabola theorem" and the "parabola-circle theorem"
of Scott and Wall.³ In §4 there is a determination of the value region
in case the element region is $|z| \le r \le \tfrac14$ (which is a subregion of $E_1$). The
main result is contained in §5, and may be described as an extension of the
Stieltjes⁴ convergence theorem.


If $a_2, a_3, a_4, \ldots$ are in the horizontal strip $-\tfrac12 h \le y \le +\tfrac12 h$ in the plane of
$z = x + iy$, where $0 < h \le 1$, then the continued fraction

$$(1.4)\qquad \cfrac{1}{1 + \cfrac{a_2^2 t}{1 + \cfrac{a_3^2 t}{1 + \cfrac{a_4^2 t}{1 + \cdots}}}}$$

converges uniformly over any bounded closed region lying entirely within the cardioid

$$(1.5)\qquad \rho = \frac{1}{2h^2}(1 + \cos\theta), \qquad t = \rho e^{i\theta},$$

provided the series $\sum |b_n|$ diverges,⁵ where $b_1 = 1$, $b_{n+1} = 1/(b_n a_{n+1}^2)$ (n = 1, 2,
3, ...). If the series $\sum |b_n|$ converges, then the sequences of even and odd approximants
converge to separate limits which are meromorphic functions of 1/t.


The Stieltjes convergence theorem appears as the limiting case h = 0. The 
function represented by a continued fraction may have a singularity at any 
assigned point upon the cardioid, and therefore the result is in a certain sense 
the best. The concluding section of the paper contains a discussion of a class 
of continued fractions with elements in the unit circle. 


2 For example, if V consists of the single point t ($t \ne 0, 1$), then E contains the single
point $a = (1 - t)/t^2$, and the continued fraction diverges if a is real and less than $-1/4$.

3 W. T. Scott and H. S. Wall, (1) A convergence theorem for continued fractions, Transactions
of the American Mathematical Society, vol. 47(1940), pp. 155-172; and (2) Value
regions for continued fractions, Bulletin of the American Mathematical Society, vol. 47
(1941), pp. 580-585.

4 T. J. Stieltjes, Recherches sur les fractions continues, Oeuvres, vol. II, 1918, pp. 402-566.

5 This series is to be counted as divergent if some $a_n$ is 0.










2. $(V_c, E_c)$. Let $c = r + is$, where r, s are real and $\tfrac12 < r \le 1$, and denote
by $V_c$ the circle $|z - c| = |c|$ and its interior. The element region $E_c$ corresponding
to $V_c$ as value region consists of all points a such that the transformation
$a(v) = 1/(1 + av)$ carries $V_c$ into a circular region in $V_c$. Put $a = x + iy$,
$w = X + iY$. Then the transformation $w = a(v)$ takes $|v - c| = |c|$ into
the circle

$$(2.1)\qquad (X - \alpha)^2 + (Y - \beta)^2 = \gamma^2,$$

where

$$\alpha = \frac{1 + rx - sy}{1 + 2(rx - sy)}, \qquad \beta = -\frac{sx + ry}{1 + 2(rx - sy)}, \qquad \gamma^2 = \frac{(rx - sy)^2 + (sx + ry)^2}{[1 + 2(rx - sy)]^2}.$$

This circle evidently passes through 1, and consequently will lie in the circle
$|z - c| = |c|$ if and only if its center lies within or upon the ellipse J: $|z - c| +
|z - 1| = |c|$, which is the locus of centers of circles through 1 tangent to
the circle $|z - c| = |c|$. The equation of J in rectangular coordinates X,
Y is

$$(1 - 2r - s^2)X^2 + 2s(r - 1)XY - r^2Y^2 + (2r^2 + 2s^2 + r - 1)X + sY + (1 - 4r^2 - 4s^2)/4 = 0.$$

Hence $a = x + iy$ must satisfy the inequality obtained by putting $X = \alpha$,
$Y = \beta$ and replacing "=" by "$\ge$" in the last equation. On doing this and
simplifying we obtain the inequality:

$$(2.2)\qquad \left[\frac{2rsx - (s^2 - r^2)y}{r^2 + s^2}\right]^2 \le \frac{1 - 2r}{r^2 + s^2}\left[\frac{(s^2 - r^2)x + 2rsy}{r^2 + s^2}\right] + \left[\frac{1 - 2r}{2(r^2 + s^2)}\right]^2.$$

When $a = x + iy$ satisfies this inequality, it is easy to see that the interior of
(2.1) lies in the interior of $V_c$. Hence we have proved that $E_c$ consists of the points
$a = x + iy$ which satisfy (2.2).

Inasmuch as interior points of $V_c$ are mapped into interior points of (2.1),
and since 1 is interior to $V_c$, it follows that when $a_2, a_3, a_4, \ldots$ are in the
parabolic region (2.2), then all the approximants $a_1 a_2 \cdots a_n(1)$ of (1.1) are in
the interior of $V_c$.

In order to throw (2.2) into a more convenient form, put

$$x' = x\cos\phi + y\sin\phi, \qquad y' = -x\sin\phi + y\cos\phi, \qquad \cos\phi = \frac{r^2 - s^2}{r^2 + s^2}, \qquad \sin\phi = -\frac{2rs}{r^2 + s^2},$$

and (2.2) becomes $y'^2 \le P\left(x' + \frac{P}{4}\right)$, where $P = (2r - 1)/(r^2 + s^2)$. It will be
observed that $\phi = -2 \arg c$. The polar equation of the parabola bounding $E_c$ is

$$(2.3)\qquad \rho = \frac{P}{2[1 - \cos(\theta - \phi)]}.$$

This parabola never contains the point $-\tfrac14$ in its interior; it passes through
$-\tfrac14$, and therefore has maximum extent, when r = 1, in which case (2.3) may be
written:

$$(2.4)\qquad \rho = \frac{\cos^2 \tfrac12\phi}{2[1 - \cos(\theta - \phi)]}.$$

We shall summarize the main results of this section as

THEOREM 2.1. If the elements $a_2, a_3, a_4, \ldots$ of (1.1) lie within or upon the
parabola

$$(2.5)\qquad |z| - \Re(z)\cos\phi - \Im(z)\sin\phi = \tfrac12 P,$$

where $-\pi < \phi < \pi$, $P = (2r - 1)/(r^2 + s^2)$, $\tfrac12 < r \le 1$, $s/r = -\tan\tfrac12\phi$, then
all its approximants lie within the circle $|z - c| = |c|$ where $c = r + is$.
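The mapping property behind Theorem 2.1 lends itself to a numerical spot check. The sketch below is illustrative only (function names, sample counts, and the sample value of c are assumptions): it samples points a on the boundary parabola through its polar equation (2.3), samples points v on the circle $|v - c| = |c|$, and records how far $w = 1/(1 + av)$ can fall outside that circle.

```python
import cmath
import math

def boundary_point(r, s, theta):
    """Point of the parabola (2.5) with polar angle theta (theta must differ from phi)."""
    phi = -2 * math.atan2(s, r)                 # phi = -2 arg c
    P = (2 * r - 1) / (r * r + s * s)
    rho = P / (2 * (1 - math.cos(theta - phi)))  # polar equation (2.3)
    return cmath.rect(rho, theta)

def worst_excess(r, s, samples=200):
    """Largest |w - c| - |c| over sampled a on (2.5) and v on |v - c| = |c|."""
    c = complex(r, s)
    phi = -2 * math.atan2(s, r)
    worst = -math.inf
    for i in range(1, samples):                 # i = 0 would give theta = phi
        a = boundary_point(r, s, phi + 2 * math.pi * i / samples)
        for j in range(samples):
            v = c + abs(c) * cmath.exp(2j * math.pi * j / samples)
            w = 1 / (1 + a * v)
            worst = max(worst, abs(w - c) - abs(c))
    return worst

print(worst_excess(0.8, 0.3))
```

A value that is zero up to rounding error is what the theorem predicts: for a on the boundary of $E_c$ the image circle touches $|z - c| = |c|$ from inside.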


3. $(V_r, E_r)$, $\tfrac12 < r \le 1$. When c = r, the preceding theorem becomes:


THEOREM 3.1. The element region $E_r$ corresponding to the value region $V_r$:
$|z - r| \le r$ ($\tfrac12 < r \le 1$) is the parabolic region $y^2 \le h(x + h/4)$, $z = x + iy$,
where $h = (2r - 1)/r^2$. If $a_2, a_3, a_4, \ldots$ are in $E_r$, then all the approximants of
the continued fraction (1.1) lie in the interior of $V_r$.


We shall now obtain conditions on $a_2, a_3, a_4, \ldots$ which are necessary and
sufficient for Case I (cf. §1). Since $E_1 \supset E_r$ for $\tfrac12 < r \le 1$, we may as well
assume that r = 1. Let $a_1 a_2 \cdots a_n(V_1) = V^{(n)}$; denote by $K^{(n)}$ the circle bounding
$V^{(n)}$; and let $R^{(n)}$ be the radius of $K^{(n)}$. Then, on applying (1.3) to the
circle $K^{(1)}$, we find for $R^{(n)}$ the value

$$(3.1)\qquad R^{(n)} = \frac{|a_2 a_3 \cdots a_n|}{|B_n|^2 - |a_n B_{n-2}|^2} \qquad (n = 1, 2, 3, \ldots).$$
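Formula (3.1) can be compared with a direct geometric computation. In the sketch below (an illustration under stated assumptions: the function names and the sample elements, chosen inside the region $|z| - \Re(z) \le \tfrac12$, are not from the paper) the radius is computed once from (3.1) and once as the circumradius of the images of three points of the circle $|v - 1| = 1$.

```python
import cmath

def denominators(a, n):
    """Return [B_{-1}, B_0, ..., B_n] for B_k = B_{k-1} + a_k B_{k-2}, with a_1 = 1."""
    B = [0, 1]                                   # B_{-1}, B_0
    for k in range(1, n + 1):
        a_k = 1 if k == 1 else a[k]
        B.append(B[-1] + a_k * B[-2])
    return B

def radius_by_formula(a, n):
    """R^(n) as given by (3.1)."""
    B = denominators(a, n)                       # B[k + 1] holds B_k
    prod = 1.0
    for k in range(2, n + 1):
        prod *= abs(a[k])
    a_n = 1 if n == 1 else a[n]
    return prod / (abs(B[n + 1]) ** 2 - abs(a_n * B[n - 1]) ** 2)

def radius_by_geometry(a, n):
    """Circumradius of the images of three points of |v - 1| = 1 under a_1 a_2 ... a_n."""
    pts = []
    for angle in (0.3, 2.1, 4.4):
        v = 1 + cmath.exp(1j * angle)
        for k in range(n, 1, -1):
            v = 1 / (1 + a[k] * v)
        pts.append(v)
    p, q, r = pts
    area = abs(((q - p).conjugate() * (r - p)).imag) / 2
    return abs(p - q) * abs(q - r) * abs(r - p) / (4 * area)

sample = {2: 0.3 + 0.2j, 3: -0.1 + 0.1j, 4: 0.5 - 0.4j, 5: 0.2j}  # hypothetical elements
for n in range(1, 6):
    print(n, radius_by_formula(sample, n), radius_by_geometry(sample, n))
```

The two columns should agree to rounding error, and the radii should decrease with n, as the nesting of the regions $V^{(n)}$ requires.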


If some $a_n$ is 0, let $a_k$ be the first which is 0. Then, inasmuch as $|B_k|^2 -
|a_k B_{k-2}|^2 = |B_k|^2 = |B_{k-1}|^2$, and $B_{k-1} \ne 0$ because of the relation $|A_{k-1}B_{k-2} -
A_{k-2}B_{k-1}| = |a_2 a_3 \cdots a_{k-1}| \ne 0$ and the fact that $A_{k-1}/B_{k-1}$ is finite, it follows
that $R^{(k)} = 0$. Hence, also, $R^{(n)} = 0$ for n = k + 1, k + 2, ..., so that $V^{(1)}$,
$V^{(2)}$, ... have one and only one point in common, and we therefore have Case I.
We now use the relations

$$(3.2)\qquad B_{n+2} = (1 + a_{n+1} + a_{n+2})B_n - a_n a_{n+1} B_{n-2} \qquad (n = 2, 3, 4, \ldots)$$

to obtain an inequality for $R^{(n)}$ in case $a_n \ne 0$ (n = 2, 3, 4, ...). Since the
$a_n$'s are in the parabolic region $E_1$ so that $|a_n| = \Re(a_n) + h_n/2$, where $0 \le h_n
\le 1$ (n = 2, 3, 4, ...), we readily find that $|1 + a_2| > |a_2|$, $|1 + a_2 + a_3| >
|a_3|$, $|1 + a_n + a_{n+1}| \ge |a_n| + |a_{n+1}|$ (n = 3, 4, 5, ...). Hence if $b_1 = 1$,
$b_{n+1} = 1/(b_n a_{n+1})$ (n = 1, 2, 3, ...), we have:

$$(3.3)\qquad \begin{aligned} &|1 + b_1 b_2| > 1, \qquad |1 + b_3 + b_2 b_3| > 1,\\ &|b_{n-1} b_n b_{n+1} + b_{n-1} + b_{n+1}| \ge |b_{n+1}| + |b_{n-1}| \qquad (n = 3, 4, 5, \ldots). \end{aligned}$$
If we put $Q_n = b_1 b_2 \cdots b_n B_n$, the first of these may be written

$$(3.4)\qquad |Q_2| \ge 1 + k|b_2|, \qquad |Q_3| \ge 1 + k|b_3|,$$

where k > 0. On making the same substitutions in (3.2) we get

$$b_n Q_{n+2} = (b_n b_{n+1} b_{n+2} + b_n + b_{n+2})Q_n - b_{n+2} Q_{n-2} \qquad (n = 2, 3, 4, \ldots),$$

from which we obtain by means of (3.3) the inequalities:

$$|Q_{n+2}| - |Q_n| \ge \left|\frac{b_{n+2}}{b_n}\right| \{|Q_n| - |Q_{n-2}|\} \qquad (n = 2, 3, 4, \ldots),$$

and consequently:

$$(3.5)\qquad |Q_{n+2}| - |Q_n| \ge k|b_{n+2}| \qquad (n = 2, 3, 4, \ldots).$$

By (3.4) this holds also if n = 0, 1.

In terms of the $Q_n$'s and $b_n$'s our expression for $R^{(n)}$ in (3.1) becomes:

$$R^{(n)} = \frac{|b_n|}{|Q_n| - |Q_{n-2}|} \cdot \frac{1}{|Q_n| + |Q_{n-2}|},$$

and therefore, by (3.5) and (3.4),

$$(3.6)\qquad R^{(2n)} < \frac{1}{2k\left(1 + k\sum_{p=1}^{n-1} |b_{2p}|\right)}, \qquad R^{(2n+1)} < \frac{1}{2k\left(1 + k\sum_{p=1}^{n-1} |b_{2p+1}|\right)}.$$

These inequalities together with the inequality $R^{(n)} < R^{(n-1)}$ show that if the
series $\sum |b_n|$ is divergent then $R^{(n)} \to 0$ as $n \to \infty$. On the other hand, when
the series $\sum |b_n|$ converges,⁶ we know that the sequences of even and odd
approximants converge to separate limits $L_0$, $L_1$, and therefore, inasmuch as

$$R^{(n)} \ge \frac12 \left|\frac{A_n}{B_n} - \frac{A_{n-1}}{B_{n-1}}\right|,$$

we conclude that $R^{(n)}$ does not tend to 0 as n tends to $\infty$ in this case. Since
$R^{(2)} \le \tfrac12$ we must have $|L_0 - L_1| \le 1$. We have completed the proof of the
following theorem.


THEOREM 3.2. If $V_r$, $E_r$ are as in Theorem 3.1, the following conditions are
necessary and sufficient for Case I (cf. §1): (i) some $a_n$ is 0; or (ii) $a_n \ne 0$ (n =
2, 3, 4, ...), and the series $\sum |b_n|$ diverges, where $b_1 = 1$, $b_{n+1} = 1/(b_n a_{n+1})$ (n =
1, 2, 3, ...). In Case I the continued fraction (1.1) converges and its value is in
$V_r$. In Case II (cf. §1), the continued fraction diverges by oscillation, the sequence
of even approximants having a limit $L_0$ and the sequence of odd approximants a
limit $L_1$. We have: $L_0 \ne L_1$; $L_0$ and $L_1$ are in $V_r$; $|L_0 - L_1| \le 1$.


6 O. Perron, Die Lehre von den Kettenbrüchen, second edition, Leipzig and Berlin, 1929,
pp. 235-236.


Scott and Wall⁷ showed that, if w is a value $\ne 0$ upon the circle K, $|z - 1| =
1$, then there is a continued fraction of the form

$$(3.7)\qquad \cfrac{1}{1 + \cfrac{a}{1 + \cfrac{\bar a}{1 + \cfrac{a}{1 + \cfrac{\bar a}{1 + \cdots}}}}},$$

where a lies on the boundary of $E_1$ and $\bar a$ is the complex conjugate of a, which
converges to the value w. We shall now prove that (3.7) is the only continued
fraction with elements in $E_1$ whose value is w. More generally, we have the following
theorem.

THEOREM 3.3. If a value region V is a region whose boundary is a circle K
passing through the point 0 and containing 1 on the interior with corresponding
element region E, and if there is a continued fraction (1.1) with elements in E
which converges to a value w upon K, then there is only one such continued fraction.


Proof. Since 1 is interior to V, all the approximants $a_1 a_2 \cdots a_n(1)$ of (1.1)
are interior to V so that no approximant can equal w. Let $a_1 a_2 \cdots a_n(K) =
K^{(n)}$. If (1.1) converges to the value w, then $K^{(n)}$ must be tangent to K at w;
and since $a_1 a_2 \cdots a_{n-1}(1) = a_1 a_2 \cdots a_n(0)$ lies on $K^{(n)}$, it follows that $a_n$ is uniquely
determined in terms of $a_2, a_3, \ldots, a_{n-1}$. The theorem now follows by
mathematical induction.


4. The Worpitzky circle. Continued fractions of the form (1.1) whose elements
lie in the neighborhood of the origin are of particular importance in the
applications to function theory. Worpitzky showed that (1.1) converges when
the $a_n$'s are in the circular region $|z| \le \tfrac14$. Since this region is contained in
$E_1$, where $E_1$ is the parabolic region of §3, the Worpitzky result is included in
Theorem 3.2. Moreover, from Theorem 3.2 it follows that the value of (1.1)
lies in $V_1$ when its elements are in the Worpitzky circle. We shall now obtain
a better estimate for the value of (1.1) in this case. The result is as follows.


THEOREM 4.1. If $a_2, a_3, a_4, \ldots$ lie within or upon the circle

$$(4.1)\qquad |z| = (2r - 1)/4r^2 \qquad (\tfrac12 < r \le 1),$$

then all the approximants of the continued fraction (1.1) lie within the circle

$$(4.2)\qquad \left|z - \frac{4r^2}{4r - 1}\right| = \frac{2r(2r - 1)}{4r - 1}.$$

There is at least one continued fraction with elements within or upon (4.1) whose
value is any preassigned number within or upon (4.2); and if w is any number
upon (4.2) there is one and only one continued fraction with elements within or
upon (4.1) whose value is w, namely:

$$(4.3)\qquad \cfrac{1}{1 + \cfrac{a_2}{1 + \cfrac{(1 - 2r)/4r^2}{1 + \cfrac{(1 - 2r)/4r^2}{1 + \cdots}}}},$$

where, if

$$(4.4)\qquad w = \frac{4r^2 + 2r(2r - 1)e^{i\phi}}{4r - 1} \qquad (0 \le \phi < 2\pi),$$

then

$$(4.5)\qquad a_2 = -\frac{2r - 1}{4r^2} \cdot \frac{\cos\phi + (8r^2 - 4r)(1 + \cos\phi) + i(4r - 1)\sin\phi}{1 + (8r^2 - 4r)(1 + \cos\phi)}.$$


7 Scott and Wall, footnote 3, (1), p. 166.


Proof. If $a_2$ is in the circle (4.1), the value of the continued fraction (4.3)
is $w = 1/(1 + 2ra_2)$; and as $a_2$ ranges over the circle (4.1) and its interior, w
ranges over the circle (4.2) and its interior. Hence there is at least one continued
fraction with elements in (4.1) which takes on any preassigned value in (4.2).

To prove the first part of the theorem it suffices to show that if v is any value
within the circle (4.2) then $w = 1/(1 + av)$ is also within this circle for all
values of a in the circle (4.1). Now if v is fixed, we may consider $w = 1/(1 + a_2 v)$
as a transformation of $a_2$ into w; and we find that it carries the circle (4.1)
into the circle with center and radius:

$$C = \frac{16r^4}{16r^4 - (2r - 1)^2 |v|^2}, \qquad R = \frac{4r^2(2r - 1)|v|}{16r^4 - (2r - 1)^2 |v|^2}.$$

Inasmuch as this circle lies within the circle (4.2) when $|v| \le 2r$, it follows
that every approximant of (1.1) lies within (4.2).

Let w be any value upon the circle (4.2), expressed in the form (4.4). We
may write $w = 1/(1 + a_2 v)$, where v is the value of a convergent continued
fraction with elements in (4.1). We then have:

$$(4.6)\qquad a_2 v = \frac{1 - w}{w} = \frac{1 - 2r}{2r} \cdot \frac{\cos\phi + (8r^2 - 4r)(1 + \cos\phi) + i(4r - 1)\sin\phi}{1 + (8r^2 - 4r)(1 + \cos\phi)},$$

and thus it follows that $|a_2 v| = (2r - 1)/2r$, or $|a_2| = [(2r - 1)/4r^2] \cdot (2r/|v|)$.
Consequently, inasmuch as $|v| \le 2r$, we conclude that $|a_2| \ge (2r - 1)/4r^2$.
But $|a_2| \le (2r - 1)/4r^2$, by hypothesis. Therefore,

$$|a_2| = (2r - 1)/4r^2, \qquad |v| = 2r, \qquad v = 2r,$$

where the last equation follows from the fact that v is in the circle (4.2). On
putting v = 2r in (4.6) we then find that $a_2$ must have the value (4.5). Now,
starting with v = 2r as the value upon the circle (4.2) to be attained, we find
in the same way that $a_3$ must be given by the expression in the right member
of (4.6) divided by 2r, but with $\phi$ now equal to 0. This gives for $a_3$ the value
$(1 - 2r)/4r^2$, and on repeating the argument we find that $a_4, a_5, \ldots$ must
all have this same value. We have completed the proof of Theorem 4.1.
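The extremal continued fraction (4.3) can be evaluated numerically and compared with the prescribed boundary value (4.4). The sketch below is illustrative (function names, the truncation length, and the sample values r = 0.75, $\phi$ = 1 are assumptions); it evaluates the periodic tail by backward iteration and then applies the first transformation with $a_2$ taken from (4.5).

```python
import cmath
import math

def extremal_value(r, phi, terms=200):
    """Evaluate (4.3) with a_2 from (4.5), truncating the periodic tail after `terms` steps."""
    tail = (1 - 2 * r) / (4 * r * r)                      # a_3 = a_4 = ...
    bracket = (8 * r * r - 4 * r) * (1 + math.cos(phi))
    ratio = (math.cos(phi) + bracket + 1j * (4 * r - 1) * math.sin(phi)) / (1 + bracket)
    a2 = tail * ratio                                     # formula (4.5)
    v = 1.0
    for _ in range(terms):                                # backward evaluation of the tail
        v = 1 / (1 + tail * v)
    return 1 / (1 + a2 * v)

def target_on_circle(r, phi):
    """The point (4.4) of the circle (4.2)."""
    return (4 * r * r + 2 * r * (2 * r - 1) * cmath.exp(1j * phi)) / (4 * r - 1)

print(extremal_value(0.75, 1.0), target_on_circle(0.75, 1.0))
```

For $\tfrac12 < r < 1$ the tail converges geometrically to 2r, so a moderate truncation suffices; at r = 1 the tail element is $-\tfrac14$ and convergence to 2 is only of order 1/n, so many more terms would be needed for comparable accuracy.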










A reexamination of the preceding proof will show that a stronger theorem
holds for the case r = 1, namely:

THEOREM 4.2. If $|a_2| \le \tfrac14$, and $a_3, a_4, \ldots$ are in the parabola $|z| - \Re(z) = \tfrac12$,
then all the approximants of the continued fraction (1.1) lie within the circle $|z -
\tfrac43| = \tfrac23$. Among all the continued fractions (1.1) with $|a_2| \le \tfrac14$ and $a_3, a_4, \ldots$
in the parabola, there is one and only one which takes on a prescribed value
upon the circle.


5. The main theorem. If $a_2, a_3, \ldots$ are in the parabolic region $E_r$ of §3
($\tfrac12 < r \le 1$) and $t = |t|e^{i\phi}$ ($-\pi < \phi < +\pi$), then $a_2 t, a_3 t, \ldots$ are in the parabolic
region $E_c$ of §2 for an appropriate value of c, if and only if

$$|t| \le \frac{1}{2h^2}(1 + \cos\phi), \qquad h^2 = (2r - 1)/r^2.$$

Hence it follows that if $a_2, a_3, \ldots$ are in the parabola

$$y^2 = h^2(x + h^2/4) \qquad (0 < h \le 1),$$

and t is in the portion of the cardioid $\rho = (1 + \cos\theta)/2h^2$ inside the sector
$-\pi + \alpha \le \theta \le +\pi - \alpha$, where $\alpha$ is an arbitrarily small positive number, then all
the approximants of the continued fraction

$$(5.1)\qquad \cfrac{1}{1 + \cfrac{a_2 t}{1 + \cfrac{a_3 t}{1 + \cfrac{a_4 t}{1 + \cdots}}}}$$

lie in a bounded region consisting of the interiors of two circles depending upon
$\alpha$. If G is any closed region entirely within the cardioid, then $\alpha > 0$ can be
found so that G will lie in the specified portion of the cardioid. Consequently,
the approximants of (5.1) are uniformly bounded over G.

Now, when $0 < t \le 1/h^2$ we know that $a_2 t, a_3 t, \ldots$ lie in the parabolic region
$E_1$, and consequently, by the "parabola theorem", (5.1) converges for these
real values of t provided the series $\sum |b_n|$ diverges, where $b_1 = 1$, $b_{n+1} =
1/(a_{n+1} b_n)$ (n = 1, 2, 3, ...), whereas the sequences of even and odd approximants
converge to separate limits for these values of t if $\sum |b_n|$ converges.
We may therefore apply the Stieltjes-Vitali theorem and conclude that (5.1)
converges uniformly over G when $\sum |b_n|$ diverges, whereas the sequences of
even and odd approximants converge uniformly over G to separate limits
when $\sum |b_n|$ converges. The limits are in all cases analytic functions of t
over G. Of course, it is well known that when $\sum |b_n|$ converges the sequences
of even and odd approximants converge over the whole plane with the exception
of isolated values of t, and the limit functions are meromorphic functions of 1/t.

We shall state our result as 


THEOREM 5.1. Let $0 < h \le 1$ and let $a_2, a_3, \ldots$ be any numbers lying within
or upon the parabola

$$(5.2)\qquad |z| - \Re(z) = \tfrac12 h^2.$$

Let G be any closed region entirely within the cardioid (1.5). Then the continued
fraction (5.1) converges uniformly over G if the series $\sum |b_n|$ diverges, where
$b_1 = 1$, $b_{n+1} = 1/(b_n a_{n+1})$ (n = 1, 2, 3, ...), while the sequences of even and odd
approximants converge uniformly over G to separate limits if $\sum |b_n|$ converges.


It will be noted that when $a_n$ is in the strip $-\tfrac12 h \le y \le +\tfrac12 h$ ($z = x + iy$),
then $a_n^2$ is in the parabola (5.2), and hence our theorem can be stated in terms of
the continued fraction (1.4) as was done in §1.


The continued fraction $\dfrac{1}{1+}\,\dfrac{at}{1+}\,\dfrac{at}{1+}\,\dfrac{at}{1+}\cdots$, in which $a$ is any value upon the parabola (5.2), represents a function $f(t)$ having a branch point at $t = -1/4a$, which is a point upon the cardioid (1.5). Hence our theorem does not hold if the cardioid is replaced by a curve enclosing points outside the cardioid. 
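The branch point can be seen from the periodic structure: the value $f$ satisfies $f = 1/(1 + at f)$, that is, $at f^2 + f - 1 = 0$, whose discriminant $1 + 4at$ vanishes at $t = -1/4a$. The sketch below compares approximants with the corresponding closed form; the function names and the sample values of $a$ and $t$ are assumptions made only for illustration.

```python
import cmath


def approximant(a: complex, t: complex, n: int) -> complex:
    """n-th approximant of 1/(1 + at/(1 + at/(1 + ...))), evaluated from the tail."""
    value = 1.0 + 0j                 # innermost denominator
    for _ in range(n - 1):
        value = 1.0 + a * t / value  # wrap one more level around the tail
    return 1.0 / value


def closed_form(a: complex, t: complex) -> complex:
    """Root of a*t*f**2 + f - 1 = 0 that tends to 1 as t -> 0 (principal square root)."""
    return (cmath.sqrt(1 + 4 * a * t) - 1) / (2 * a * t)


if __name__ == "__main__":
    a, t = 1 + 0.5j, 0.3
    print(approximant(a, t, 60), closed_form(a, t))
    # The discriminant 1 + 4*a*t vanishes at t = -1/(4a), the branch point.
    print(1 + 4 * a * (-1 / (4 * a)))
```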

If $a_2, a_3, \dots$ are real and positive and $G$ is an arbitrary bounded closed region at a positive distance from the negative half of the real axis, then, by choosing $h > 0$ sufficiently small, the cardioid (1.5) can be made large enough to contain $G$ on its interior, and at the same time the $a_n$'s will remain in the parabola (5.2). Hence the continued fraction converges uniformly over $G$ if $\sum b_n$ diverges, and the sequences of even and odd approximants converge uniformly over $G$ if $\sum b_n$ converges. Thus we obtain the Stieltjes convergence theorem as a limiting case of Theorem 5.1. 

As a corollary to Theorem 5.1 we have at once the following theorem about 
the continued fraction (1.1) and the element region (, of §2. 


THEOREM 5.2. Ifc = r + is (4 <r < 1) and @, a;, --~ are in Qt. (ef. §2), 
then the continued fraction (1.1) converges if and only if the series >> | bn | diverges. 
In particular, (1.1) converges if the a,’s are in any bounded portion of Q. . 


If $a$ is any point of the plane which is not in the interval $[\Re(a) \le -\tfrac{1}{4},\ \Im(a) = 0]$, 


c can be so chosen that @, contains a on the interior. Hence we have this 


theorem. 


THEOREM 5.3.⁸ If $a$ is not in the interval $[\Re(a) \le -\tfrac{1}{4},\ \Im(a) = 0]$, then a domain $D$ of arbitrarily large (finite) area can be found containing $a$ on the interior such that (1.1) converges if the $a_n$'s are in $D$. 


The following theorem of E. B. Van Vleck⁹ is easy to establish with the aid of the preceding ideas. 


THEOREM 5.4. If $a_1, a_2, a_3, \dots$ is a sequence of numbers having a finite limit $a \ne 0$, and $G$ is any bounded closed region containing no point of the ray 


8 Otto Szász established the existence of a circular domain $D$ with center $a$ such that (1.1) converges if the $a_n$'s are in $D$. Cf. Perron, footnote 6, p. 282. 

9 E. B. Van Vleck, On the convergence of algebraic continued fractions whose coefficients have limiting values, Transactions of the American Mathematical Society, vol. 5(1904), pp. 253-262. Cf. Perron, footnote 6, p. 288. 
















CONTINUED FRACTION AS SEQUENCE OF LINEAR TRANSFORMATIONS 369 
from $-1/4a$ to $\infty$ in the direction of the vector $-1/4a$, then there exists an $N$ such that if $n \ge N$ the continued fraction 

(5.3)    $\dfrac{1}{1+}\,\dfrac{a_{n+1} t}{1+}\,\dfrac{a_{n+2} t}{1+}\cdots$ 

converges uniformly over the region $G$. 


Proof. The region $aG$ is a bounded closed region containing no point of the interval $[\Re(a) \le -\tfrac{1}{4},\ \Im(a) = 0]$. Let $k > 1$ and $h > 0$ be so chosen that $aG$ will lie on the interior of the region bounded by the cardioid (1.5) together with the circle $|z + 1/8k| = 1/8k$. Then if $t$ is in $G$ so that $at$ is in $aG$ and $N$ is sufficiently large, we shall have $a_n/a$ in the parabola (5.2), and at the same time $|a_n/a| < k$, for $n > N$. It follows that if $n \ge N$ the approximants of the continued fraction 

$\dfrac{1}{1+}\,\dfrac{(a_{n+1}/a)\,at}{1+}\,\dfrac{(a_{n+2}/a)\,at}{1+}\cdots$ 

are uniformly bounded for $t$ in $G$. In fact, if $at$ is in the circle $|z + 1/8k| = 1/8k$, then $|(a_n/a)at| < k(1/4k) = 1/4$ for $n > N$, so that the approximants are all in the circle $|z - 1| = 1$; while if $at$ is in $aG$ but outside the circle $|z + 1/8k| = 1/8k$, the approximants are in one or the other of two fixed circles, as in the proof of Theorem 5.1. Therefore the continued fraction (5.3) converges uniformly over $G$ if $n \ge N$ inasmuch as the series $\sum |b_n|$ constructed for (5.3) is evidently divergent. 


6. A class of continued fractions with elements in the unit circle. In 1901, E. B. Van Vleck¹⁰ proved a theorem which may be stated in the following form. 


THEOREM 6.1. If $g_1, g_2, \dots$ are real numbers such that $0 \le g_n < 1$ $(n = 1, 2, \dots)$ and $x_1, x_2, \dots$ are independent complex variables, then the continued fraction 

(6.1)    $\dfrac{1}{1+}\,\dfrac{g_1 x_1}{1+}\,\dfrac{(1 - g_1) g_2 x_2}{1+}\,\dfrac{(1 - g_2) g_3 x_3}{1+}\cdots$ 

converges uniformly for $|x_n| \le 1$ $(n = 1, 2, \dots)$ provided the series 

(6.2)    $1 + \sum_{n=1}^{\infty} \dfrac{g_1 g_2 \cdots g_n}{(1 - g_1)(1 - g_2) \cdots (1 - g_n)}$ 

converges. The sum of the series (6.2) is the value of the continued fraction for $x_1 = x_2 = \cdots = -1$, and is the least upper bound of the absolute value of the continued fraction for $|x_n| \le 1$ $(n = 1, 2, \dots)$. 


10 E. B. Van Vleck, On the convergence and character of the continued fraction $a_1 z/1 + a_2 z/1 + a_3 z/1 + \cdots$, Transactions of the American Mathematical Society, vol. 2(1901), pp. 476-483. 










Closely related to this theorem of E. B. Van Vleck is the following theorem which includes the Pringsheim criteria.¹¹ 


THEOREM 6.2. If $0 \le g_n < 1$ or $0 < g_n \le 1$ $(n = 1, 2, \dots)$, then the continued fraction 

(6.3)    $\dfrac{g_1}{1+}\,\dfrac{(1 - g_1) g_2 x_2}{1+}\,\dfrac{(1 - g_2) g_3 x_3}{1+}\cdots$ 

converges uniformly for $|x_n| \le 1$ $(n = 2, 3, \dots)$, its value for $x_2 = x_3 = \cdots = -1$ is $1 - (1/S)$ where $S$ is the sum of the series (6.2), and the absolute value of the continued fraction does not exceed $1 - (1/S)$ for $|x_n| \le 1$ $(n = 2, 3, \dots)$. 


It is easy to see that Theorem 6.1 is a consequence of Theorem 6.2. For, if we multiply (6.3) by $x_1$, add 1, and take the reciprocal of the resulting continued fraction, we obtain (6.1); and the uniform convergence of (6.1) follows from the fact that $1 - (1/S) < 1$ if (6.2) converges. 

Our object here is to give a simple proof of Theorem 6.1, to show, conversely, that Theorem 6.2 is a consequence of Theorem 6.1, and, finally, using the method of linear transformations, to show that (6.1) always converges for $|x_n| \le 1$ $(n = 1, 2, \dots)$, except possibly for $x_1 = x_2 = \cdots = -1$. 

(i) Proof of Theorem 6.1. If $x_1 = x_2 = \cdots = -1$ and $g_n = t_n/(1 + t_n)$ $(n = 1, 2, \dots)$, then (6.1) is equivalent to the continued fraction 

(6.4)    $\dfrac{1}{1-}\,\dfrac{t_1}{1 + t_1 -}\,\dfrac{t_2}{1 + t_2 -}\cdots$ 

whose $n$-th numerator and denominator are $G_n = 1 + t_1 + t_1 t_2 + \cdots + t_1 t_2 \cdots t_n$, and $H_n = 1$, respectively. Thus we see that (6.4) is equivalent to the series $1 + \sum t_1 t_2 \cdots t_n$, which is the same as the series (6.2). Hence it follows that (6.1) converges for $x_1 = x_2 = \cdots = -1$ if and only if the series (6.2) converges, and the series and continued fraction are equal when the $x$'s have this value. 
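The equivalence of the continued fraction and the series can be checked in exact arithmetic. The sketch below assumes that (6.4) has the form $1/(1 - t_1/(1 + t_1 - t_2/(1 + t_2 - \cdots)))$ and that (6.1) is taken with every $x_n = -1$; the function names and sample values are illustrative only.

```python
from fractions import Fraction


def partial_sum(ts):
    """1 + t_1 + t_1 t_2 + ... + t_1 t_2 ... t_n."""
    total, prod = Fraction(1), Fraction(1)
    for t in ts:
        prod *= t
        total += prod
    return total


def cf_64(ts):
    """Finite section of (6.4), evaluated from the tail."""
    tail = Fraction(0)
    for t in reversed(ts):
        tail = t / (1 + t - tail)
    return 1 / (1 - tail)


def cf_61_at_minus_one(ts):
    """Finite section of (6.1) with x_n = -1 and g_n = t_n / (1 + t_n)."""
    gs = [t / (1 + t) for t in ts]
    coeffs = [gs[0]] + [(1 - gs[i - 1]) * gs[i] for i in range(1, len(gs))]
    tail = Fraction(0)
    for c in reversed(coeffs):
        tail = c / (1 - tail)
    return 1 / (1 - tail)


if __name__ == "__main__":
    ts = [Fraction(1, 2), Fraction(3), Fraction(2, 5), Fraction(7, 3)]
    print(partial_sum(ts), cf_64(ts), cf_61_at_minus_one(ts))
```

All three quantities agree for any positive rational $t_1, \dots, t_n$, which is the statement that the approximants of (6.1) at $x_n = -1$ are the partial sums of (6.2).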

We shall show next that if $|x_n| \le 1$ $(n = 1, 2, \dots)$, then (6.1) can be written in the form 

(6.5)    $\dfrac{1}{1-}\,\dfrac{r_1}{1 + r_1 -}\,\dfrac{r_2}{1 + r_2 -}\cdots$ 

where 

(6.6)    $|r_n| \le t_n$    $(n = 1, 2, \dots)$; 

and since (6.5) is equivalent to the series $1 + \sum r_1 r_2 \cdots r_n$, the truth of the theorem will be evident. 


11 Cf. Perron, footnote 6, pp. 254-264. Cf. also Scott and Wall, footnote 4, (1), pp. 158- 
160. Following Perron, Scott and Wall erroneously ascribe Theorem 3.2, p. 159, to Van 
Vleck. 










Let $r_1, r_2, \dots$ be defined recursively by the formula 

(6.7)    $r_n = \dfrac{-k_{n+1} x_n (1 + r_{n-1})}{1 + k_{n+1} x_n (1 + r_{n-1})}$    $(n = 1, 2, \dots)$, 

where $r_0 = 0$, and $k_{n+1} = t_n/(1 + t_{n-1})(1 + t_n)$ $(n = 1, 2, \dots;\ t_0 = 0)$. Then, $|r_1| = |k_2 x_1/(1 + k_2 x_1)| \le k_2/(1 - k_2) = t_1$, and, by mathematical induction, if $|r_{n-1}| \le t_{n-1}$ then $|r_n| = |k_{n+1} x_n (1 + r_{n-1})/\{1 + k_{n+1} x_n (1 + r_{n-1})\}| \le k_{n+1}(1 + t_{n-1})/\{1 - k_{n+1}(1 + t_{n-1})\} = t_n$, so that (6.6) holds. Now if $B_n$ is the $n$-th denominator of (6.1) then we have: 

(6.8)    $r_n = \dfrac{-k_{n+1} x_n B_{n-1}}{B_{n+1}}$  and  $B_{n+1} \ne 0$    $(n = 1, 2, \dots)$. 

For, $B_0 = B_1 = 1$, $B_2 = 1 + k_2 x_1 \ne 0$, and the formula is evidently true when $n = 1$. Assuming that (6.8) holds for $n \le p$, it follows at once from (6.7) and (6.6) that (6.8) holds for $n = p + 1$, and hence for all $n$. On substituting the values of the $r$'s from (6.8) into (6.5) the latter may be readily transformed into (6.1), and the proof is complete. 
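The recursion and its closed form lend themselves to a numerical check. The sketch below assumes the readings $r_n = -k_{n+1} x_n (1 + r_{n-1}) / (1 + k_{n+1} x_n (1 + r_{n-1}))$ for (6.7) and $r_n = -k_{n+1} x_n B_{n-1} / B_{n+1}$ for (6.8), with $B_{n+1} = B_n + k_{n+1} x_n B_{n-1}$ the usual three-term recurrence for the denominators of (6.1); the function name and the random inputs are illustrative.

```python
import cmath
import random


def check_recursion(ts, xs):
    """Return triples (r_n from (6.7), r_n from (6.8), t_n) for n = 1..len(ts).

    ts holds t_1..t_n (positive reals); xs holds x_1..x_n (complex, |x| <= 1).
    """
    n = len(ts)
    t = [0.0] + list(ts)                      # t[0] is t_0 = 0
    # k[m] is k_m = t_{m-1} / ((1 + t_{m-2}) (1 + t_{m-1})) for m = 2..n+1
    k = {m: t[m - 1] / ((1 + t[m - 2]) * (1 + t[m - 1])) for m in range(2, n + 2)}
    B = [1 + 0j, 1 + 0j]                      # B_0, B_1
    for m in range(1, n + 1):
        B.append(B[m] + k[m + 1] * xs[m - 1] * B[m - 1])
    rows, r_prev = [], 0j
    for m in range(1, n + 1):
        u = k[m + 1] * xs[m - 1] * (1 + r_prev)
        r = -u / (1 + u)                      # recursion (6.7)
        r_closed = -k[m + 1] * xs[m - 1] * B[m - 1] / B[m + 1]   # closed form (6.8)
        rows.append((r, r_closed, t[m]))
        r_prev = r
    return rows


if __name__ == "__main__":
    rng = random.Random(0)
    ts = [rng.uniform(0.1, 3.0) for _ in range(8)]
    xs = [rng.random() * cmath.exp(2j * cmath.pi * rng.random()) for _ in range(8)]
    for r, r_closed, tn in check_recursion(ts, xs):
        print(abs(r - r_closed), abs(r), tn)
```

Each printed row shows a negligible difference between the two expressions for $r_n$ and a modulus not exceeding $t_n$, in line with (6.6).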

(ii) Under either of the hypotheses of Theorem 6.2, the continued fraction (6.3) can be written in the form 

(6.9)    $g_1 \left\{ \dfrac{1}{1+}\,\dfrac{h_1 x_2}{1+}\,\dfrac{(1 - h_1) h_2 x_3}{1+}\,\dfrac{(1 - h_2) h_3 x_4}{1+}\cdots \right\}$, 

where $0 \le h_n < 1$ $(n = 1, 2, \dots)$; and if $g_1 > 0$ the series 





(6.10)    $1 + \sum_{n=1}^{\infty} \dfrac{h_1 h_2 \cdots h_n}{(1 - h_1)(1 - h_2) \cdots (1 - h_n)}$ 

converges to the sum $\dfrac{1 - (1/S)}{g_1}$, where $S$ is the sum of the series (6.2) (possibly $\infty$). 

Thus, Theorem 6.2 is a consequence of Theorem 6.1. 

We shall suppose $g_1 > 0$ since the theorem is trivial if $g_1 = 0$. Put $x_2 = x_3 = \cdots = -1$ in (6.3), and denote the $n$-th numerator and denominator of the resulting continued fraction by $P_n$ and $Q_n$, respectively. Then one may verify by mathematical induction that $P_n$ and $Q_n$ are polynomials in $g_1, g_2, \dots, g_n$ given by the formulas $P_n = (1 - g_1)(1 - g_2) \cdots (1 - g_n)(S_n - 1)$, $Q_n = (1 - g_1)(1 - g_2) \cdots (1 - g_n) S_n$, where $S_n$ is the sum of the first $n + 1$ terms of the series (6.2). If $s_n = (1 - g_n) g_{n+1} Q_{n-1}/Q_{n+1}$, then $s_n \ge 0$ and the series $1 + \sum s_1 s_2 \cdots s_n$ converges, its sum being equal to $\dfrac{1 - (1/S)}{g_1}$. Hence $h_n = s_n/(1 + s_n)$ satisfies the inequality $0 \le h_n < 1$, and the series (6.10) converges to the sum $\dfrac{1 - (1/S)}{g_1}$. Moreover, one may verify at once that $h_1 = (1 - g_1) g_2$, $(1 - h_n) h_{n+1} = (1 - g_{n+1}) g_{n+2}$ $(n = 1, 2, \dots)$, and therefore (6.3) and (6.9) are equal, as was to be proved. 
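The product formulas for $P_n$, $Q_n$ and the relations between the $h$'s and the $g$'s can be confirmed in exact rational arithmetic. The sketch below assumes the readings $P_n = \prod_{j \le n}(1 - g_j)\,(S_n - 1)$, $Q_n = \prod_{j \le n}(1 - g_j)\,S_n$ and $s_n = (1 - g_n) g_{n+1} Q_{n-1}/Q_{n+1}$; the helper names and the sample $g$'s are illustrative.

```python
from fractions import Fraction


def numerators_denominators(gs):
    """P_0..P_n and Q_0..Q_n of (6.3) with x_2 = x_3 = ... = -1.

    The fraction is g_1/(1 - c_1/(1 - c_2/(1 - ...))) with c_m = (1 - g_m) g_{m+1}.
    """
    P = [Fraction(0), gs[0]]
    Q = [Fraction(1), Fraction(1)]
    for m in range(1, len(gs)):
        c = (1 - gs[m - 1]) * gs[m]
        P.append(P[m] - c * P[m - 1])
        Q.append(Q[m] - c * Q[m - 1])
    return P, Q


def S_partial(gs, n):
    """Sum of the first n + 1 terms of the series (6.2)."""
    total, term = Fraction(1), Fraction(1)
    for g in gs[:n]:
        term *= g / (1 - g)
        total += term
    return total


def h_values(gs):
    """h_1..h_{n-1} from h_m = s_m / (1 + s_m), s_m = (1 - g_m) g_{m+1} Q_{m-1} / Q_{m+1}."""
    _, Q = numerators_denominators(gs)
    hs = []
    for m in range(1, len(gs)):
        s = (1 - gs[m - 1]) * gs[m] * Q[m - 1] / Q[m + 1]
        hs.append(s / (1 + s))
    return hs


if __name__ == "__main__":
    gs = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 5), Fraction(3, 4), Fraction(1, 6), Fraction(5, 7)]
    print(h_values(gs))
```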







(iii) If $0 < g_n < 1$ $(n = 1, 2, \dots)$ and the series (6.2) diverges, then the continued fraction (6.1) converges for $|x_n| \le 1$ $(n = 1, 2, \dots)$, except when $x_1 = x_2 = \cdots = -1$. 

Proof. If $w$ is the value of (6.3), then (6.1) diverges if (and only if) $x_1 w = -1$. Since $|x_1| \le 1$, $|w| \le 1$, this implies that $|x_1| = 1$, $|w| = 1$. Now we may write: $w = g_1/[1 + (1 - g_1) v_1]$, where 

$v_1 = \dfrac{g_2 x_2}{1+}\,\dfrac{(1 - g_2) g_3 x_3}{1+}\,\dfrac{(1 - g_3) g_4 x_4}{1+}\cdots$, 

so that $|v_1| \le 1$. Consequently, we see that $|w - \tfrac{1}{2}| \le \tfrac{1}{2}$ and therefore $w = 1$, $x_1 = -1$, $v_1 = -1$. Repeating this argument starting with $v_1$ instead of with $w$, we find that $x_2 = -1$, and then, by mathematical induction, $x_3 = x_4 = \cdots = -1$. 


NORTHWESTERN UNIVERSITY. 














SOME PROPERTIES OF SUMMABILITY 
By J. D. Hill 


Introduction. The purpose of this note is to present a few general results 
pertaining to some well-known and desirable properties of summability. By a 
method of summability, we shall ordinarily mean the familiar matrix method of 
assigning a limit to a sequence, although some of the broader remarks apply to 
any method of summability. The principal results are derived for the class of 
reversible¹ matrix methods, namely those for which the system of equations 

$\sum_{k=1}^{\infty} a_{mk} s_k = t_m$    $(m = 1, 2, 3, \dots)$ 

has a unique solution $\{s_k\}$ corresponding to each convergent sequence $\{t_m\}$. The complex number system is employed throughout except where the contrary is specified. 


1. Translative methods. We shall say that a method of summability is translative to the right (to the left) if the summability of the sequence $s_1, s_2, s_3, \dots$ to the limit $s$ always implies the summability of the sequence $s_2, s_3, s_4, \dots$ ($s_0, s_1, s_2, \dots$, for arbitrary $s_0$) to the limit $s$. A method that is both translative to the right and to the left will be called simply translative. We shall concern ourselves principally with methods of this latter type. It is immediately obvious that the summability of $s_1, s_2, s_3, \dots$ to $s$ by means of a translative method implies the summability of $s_{m+1}, s_{m+2}, s_{m+3}, \dots$ to $s$, where $m$ may be any positive or negative integer and $s_k$ is to be interpreted as arbitrary if $k \le 0$. Furthermore, if the sequence of partial sums of the series $u_1 + u_2 + u_3 + \cdots$ is summable to $s$ by means of a linear and regular translative method, then the sequences of partial sums of the series $u_2 + u_3 + u_4 + \cdots$ and $u_0 + u_1 + u_2 + \cdots$ ($u_0$ arbitrary) are summable to $s - u_1$ and $s + u_0$, respectively. Conversely, if the summability of the series $u_1 + u_2 + u_3 + \cdots$ to $s$ by means of a linear regular method $A$ always implies the simultaneous summability of the series $u_2 + u_3 + u_4 + \cdots$ and $u_0 + u_1 + u_2 + \cdots$ to $s - u_1$ and $s + u_0$, respectively, it follows that $A$ must be translative. 
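As a concrete illustration of the definition, the arithmetic means (Cesàro means of order one) sum the divergent sequence $0, 1, 0, 1, \dots$ to $\tfrac{1}{2}$, and both shifted sequences receive the same value. The sketch below is only a finite numerical experiment, not a proof; the function name, the choice of test sequence, and the arbitrary value taken for $s_0$ are illustrative.

```python
def cesaro_means(seq):
    """Return the arithmetic means (s_1 + ... + s_n) / n for n = 1..len(seq)."""
    means, total = [], 0.0
    for n, s in enumerate(seq, start=1):
        total += s
        means.append(total / n)
    return means


if __name__ == "__main__":
    N = 200_000
    s = [k % 2 for k in range(N)]        # s_1, s_2, ... = 0, 1, 0, 1, ...
    shifted_right = s[1:]                # s_2, s_3, s_4, ...
    shifted_left = [100.0] + s           # s_0, s_1, s_2, ... with an arbitrary s_0
    for name, seq in [("original", s), ("right", shifted_right), ("left", shifted_left)]:
        print(name, cesaro_means(seq)[-1])
```

The last mean of each of the three sequences is close to $\tfrac{1}{2}$; the influence of the arbitrary $s_0$ decays like $1/n$.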

The property described here as translativity has been mentioned by several writers² as a desirable adjunct to a method of summability, although the known 

Received December 4, 1941. 

1 See S. Banach, Théorie des Opérations Linéaires, Warsaw, 1932, p. 90. The term reversible has been used by I. Schur (Über die Äquivalenz der Cesàroschen und Hölderschen Mittelwerte, Mathematische Annalen, vol. 74(1913), pp. 447-458) in a sense different from that of Banach. 

2 The reader will find references easy to locate in L. L. Smail, History and Synopsis 
of the Theory of Summable Infinite Processes, University of Oregon Press, 1925. 













374 J. D. HILL 


results bear largely on special methods. Hardy,³ for instance, has shown that Borel summability is translative to the left but not to the right. Knopp,⁴ on the other hand, has established the fact that Euler summability is translative. Moreover, it is a trivial matter to verify directly that both Abel summability and regular Nörlund summability (of which Cesàro summability is a special case) are likewise translative methods. Since any method equivalent to a translative method is itself translative, the translativity of Hölder means follows from that of the Cesàro. Garabedian and Randels⁵ have obtained a necessary and sufficient condition in order that regular Riesz means with positive weights shall be translative to the right. We shall show later (see Theorem 2) that this condition may be extended to complex weights, and in addition we obtain the condition for translativity to the left. 

An approach to the general problem of translativity has been made by Hurwitz and Silverman,⁶ and later by Carmichael.⁷ The former obtain sufficient conditions in order that an analytically regular transformation shall be translative, and they also show that analytically regular transformations can be constructed which are translative neither to the left nor to the right. Carmichael goes a step further and states necessary and sufficient conditions in order that normal⁸ and regular matrix summability shall be translative. We propose here to obtain necessary and sufficient conditions in order that the more general reversible matrix summability shall be translative. The conditions of Carmichael will appear as a special case, although in a more explicit form. 

According to the definition stated above, the method of summability $A$ defined by the matrix $(a_{mk})$ will be called translative if, for arbitrary $s_0$, the condition 

(1.01)    $\lim_{m} \sum_{k=1}^{\infty} a_{mk} s_k$ exists and equals $s$ 

always implies the coexistence of 

$\lim_{m} \sum_{k=1}^{\infty} a_{mk} s_{k-1}$  and  $\lim_{m} \sum_{k=1}^{\infty} a_{mk} s_{k+1}$ 

and their equality with $s$. As a consequence of this definition, we have at once the following lemma,⁹ whose proof we leave to the reader. 


3 G. H. Hardy, Researches in the theory of divergent series and divergent integrals, Quarterly Journal of Mathematics, vol. 35(1904), pp. 22-66. 

4 K. Knopp, Über das Eulersche Summierungsverfahren, Mathematische Zeitschrift, vol. 15(1922), pp. 226-253. 

5 H. L. Garabedian and W. C. Randels, Theorems on Riesz means, Duke Mathematical Journal, vol. 4(1938), pp. 529-533. See Theorem 4, p. 532. 

6 W. A. Hurwitz and L. L. Silverman, On the consistency and equivalence of certain definitions of summability, Transactions of the American Mathematical Society, vol. 18(1917), pp. 1-20. 

7 R. D. Carmichael, General aspects of the theory of summable series, Bulletin of the American Mathematical Society, vol. 25(1918), pp. 97-131. 

8 In order that a triangular matrix $(a_{mk})$ be reversible, it is necessary and sufficient that $a_{mm}$ be different from 0 for all $m$; such a matrix is called normal. 

9 Lemmas 1 and 2 were developed in collaboration with Prof. H. J. Hamilton. 










LEMMA 1. In order that $A$ be translative, it is necessary and sufficient that 

(1.02)    $\lim_{m} a_{mk} = 0$    $(k = 1, 2, 3, \dots)$ 

and that the condition (1.01) always imply the coexistence of¹⁰ 

$\lim_{m} \sum_{k=1}^{\infty} a_{m,k-1} s_k$  and  $\lim_{m} \sum_{k=1}^{\infty} a_{m,k+1} s_k$ 

and their equality with $s$. 

Let us denote by $A_R$ and $A_L$ the methods corresponding to the matrices $(a_{m,k-1})$ and $(a_{m,k+1})$, respectively. We may then express Lemma 1 in the following equivalent form. 

LEMMA 2. In order that $A$ be translative, it is necessary and sufficient that condition (1.02) hold and that the methods $A_R$ and $A_L$ be not weaker than $A$ and consistent with $A$. 

We may pause at this point to observe that if $A$ is normal then the methods $A_R$ and $A_L$ will be not weaker than $A$ and consistent with $A$ if and only if the methods $A_R A^{-1}$ and $A_L A^{-1}$ are regular. The latter are essentially the conditions of Carmichael mentioned above. 

In view of Lemma 2, the problem of characterizing translativity leads at once 
to the more general problem of determining necessary and sufficient conditions 
on a method B in order that it be not weaker than, and consistent with, a given 
method A. For the case in which A is assumed to be normal, a solution of the 
latter problem has been given by Mazur.¹¹ We proceed now to show that it is possible, by slight modifications of Mazur's proof, to obtain a similar result for the case in which $A$ is assumed to be merely reversible. We recall that the method $A$ corresponding to the matrix $(a_{mk})$ of complex numbers is said to be reversible if the system of equations 

(1.03)    $\sum_{k=1}^{\infty} a_{mk} s_k = t_m$    $(m = 1, 2, 3, \dots)$ 

has a unique solution $\{s_k\}$ corresponding to each sequence $\{t_m\}$ in the space $(c)$ of complex convergent sequences. In particular, if $\{t_m\}$ is allowed to become $Y = \{1, 1, 1, \dots\}$ or $Y_n = \{\delta_m^n\}$ for $n = 1, 2, 3, \dots$, we denote the corresponding solutions $\{s_k\}$ by $\{\xi_k\}$ or $\{\xi_k^n\}$, respectively. 

The main point in the proof of Mazur may be regarded as that of finding explicit expressions for the $s_k$ in terms of the $t_m$ and the fundamental solutions $\{\xi_k\}$ and $\{\xi_k^n\}$. In the event that $A$ is normal, such expressions may be obtained very simply by means of Cramer's rule. In the present case we resort to the complex analogue of a theorem of Banach (see footnote 1, p. 47, Théorème 10), which, with regard to the application in view, may be stated as follows. 


10 We define $a_{m0}$ as 0 $(m = 1, 2, 3, \dots)$. 

11 S. Mazur, Über lineare Limitierungsverfahren, Mathematische Zeitschrift, vol. 28(1928), pp. 599-611; in particular, see Theorems III, IV, and V. We assume that the reader is familiar with this paper. 










THEOREM A. If the method $A$ is reversible, the solution of the system (1.03) may be expressed in the form 

(1.04)    $s_k(y) = C_k \lim_{m} t_m + \sum_{m=1}^{\infty} C_{mk} t_m$    $(k = 1, 2, 3, \dots)$ 

in which the coefficients $C_k$ and $C_{mk}$ are independent of $y = \{t_m\}$ and satisfy the condition $\sum_{m=1}^{\infty} |C_{mk}| < \infty$ $(k = 1, 2, 3, \dots)$. 


The proof of this theorem follows exactly the lines of the original as far as the final remark, which states that the $s_k(y)$ are linear functionals in the space of the sequences $y = \{t_m\}$. In the present case, it is easily seen that the $s_k(y)$ are additive and continuous operations on the space $(c)$ to the space of complex numbers that satisfy the condition $s_k(zy) = z \cdot s_k(y)$ for every complex number $z$. The derivation of the general form of such an operation is formally identical with the derivation of the general linear functional in the space of real convergent sequences (see Banach, footnote 1, p. 65, §3). It is clear then that $s_k(y)$ may be written in the form (1.04), and since we have $\|s_k\| = |C_k| + \sum_{m=1}^{\infty} |C_{mk}|$, the final remark in the theorem is verified. 


m=1 


It remains to express the coefficients $C_k$ and $C_{mk}$ in terms of the fundamental solutions $\{\xi_k\}$ and $\{\xi_k^n\}$ defined above. We have $\xi_k^m = s_k(Y_m) = C_{mk}$, and consequently $\xi_k = s_k(Y) = C_k + \sum_{m=1}^{\infty} \xi_k^m$. If we now define $\xi_k^0$ and $t_0$ by means of the relations $\xi_k^0 = C_k = \xi_k - \sum_{m=1}^{\infty} \xi_k^m$ and $t_0 = \lim_{m} t_m$, then (1.04) may be written in the form 

(1.05)    $s_k(y) = \sum_{m=0}^{\infty} \xi_k^m t_m$    $(k = 1, 2, 3, \dots)$. 

On the basis of equations (1.05), it is now possible, with no essential modifica- 
tions, to follow the lines of Mazur’s proof and thus arrive at the following 
theorem. 

THEOREM B. Let $A$ with matrix $(a_{mk})$ be a given reversible method, and let $B$ with matrix $(b_{mk})$ be any method whatsoever. Then in order that $B$ be not weaker than $A$ and consistent with $A$ the following conditions are necessary and sufficient. 

(1.06)    $B$-$\lim \{\xi_k\}$ exists and $= 1$; 
(1.07)    $B$-$\lim \{\xi_k^n\}$ exists and $= 0$    $(n = 1, 2, 3, \dots)$; 
(1.08)    $\sup_{r} \sum_{n=1}^{\infty} \left| \sum_{k=1}^{r} b_{mk} \xi_k^n \right| < \infty$    $(m = 1, 2, 3, \dots)$; 
(1.09)    $\sup_{m} \sum_{n=1}^{\infty} \left| \sum_{k=1}^{\infty} b_{mk} \xi_k^n \right| < \infty$. 






Returning now to Lemma 2 and taking into account the result of Theorem B, 
we may state the following characterization of reversible translative methods. 

THEOREM 1. In order that a reversible method $A$ be translative, it is necessary and sufficient that condition (1.02) hold and that the methods $A_R$ and $A_L$ satisfy the conditions (1.06)-(1.09). 

As an application of Theorem 1, we take $A$ as the normal and regular method of Riesz means. The latter in its most general form is defined by a triangular matrix whose elements are given by $a_{mk} = p_k/P_m$ $(k = 1, 2, 3, \dots, m;\ m = 1, 2, 3, \dots)$, where the complex numbers $p_k$ and $P_m = p_1 + p_2 + \cdots + p_m$ $(k, m = 1, 2, 3, \dots)$ are all different from 0 and satisfy the Silverman-Toeplitz regularity conditions. We notice that condition (1.02) is implied by the regularity. 

One may readily verify in this case that the fundamental solutions are given by the following equations. 

(1.10)    $\xi_k = 1$    $(k = 1, 2, 3, \dots)$; 
          $\xi_n^n = \dfrac{P_n}{p_n}$;  $\xi_{n+1}^n = -\dfrac{P_n}{p_{n+1}}$;  $\xi_k^n = 0$ otherwise    $(k, n = 1, 2, 3, \dots)$. 
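The fundamental solutions of the Riesz transformation can be verified directly: applying $t_m = (p_1 s_1 + \cdots + p_m s_m)/P_m$ to the sequence with $\xi_n^n = P_n/p_n$, $\xi_{n+1}^n = -P_n/p_{n+1}$ and zeros elsewhere should return the unit sequence with a 1 in place $n$. The sketch below does this in exact arithmetic on a finite section; the function names and the sample weights are illustrative assumptions.

```python
from fractions import Fraction


def riesz_transform(p, s):
    """Return t_m = (p_1 s_1 + ... + p_m s_m) / P_m for m = 1..len(p)."""
    out, acc, P = [], Fraction(0), Fraction(0)
    for pk, sk in zip(p, s):
        acc += pk * sk
        P += pk
        out.append(acc / P)
    return out


def fundamental_solution(p, n):
    """Finite section of the sequence xi^n in (1.10); n is 1-indexed."""
    P, acc = [], Fraction(0)
    for pk in p:
        acc += pk
        P.append(acc)
    xi = [Fraction(0)] * len(p)
    xi[n - 1] = P[n - 1] / p[n - 1]          # xi_n^n = P_n / p_n
    if n < len(p):
        xi[n] = -P[n - 1] / p[n]             # xi_{n+1}^n = -P_n / p_{n+1}
    return xi


if __name__ == "__main__":
    p = [Fraction(k) for k in (1, 3, 2, 5, 4, 7)]
    for n in range(1, len(p) + 1):
        print(n, riesz_transform(p, fundamental_solution(p, n)))
```

The constant sequence $\xi_k = 1$ is likewise carried to the constant sequence $1, 1, 1, \dots$, since each Riesz mean of ones equals one.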
Furthermore, since the regularity of $A$ implies that of $A_R$ and $A_L$, it is clear from (1.10) that both $A_R$ and $A_L$ satisfy the conditions (1.06) and (1.07). We consider next the conditions (1.08) and (1.09) as applied to $A_R$ and $A_L$. By employing (1.10) to evaluate the summations involved, we find without difficulty the following expressions, wherein $p_0$ is understood to be 0. 


(1.11) G..= x > Om, k— ig; ani l P,, | -atp >a Pn- 1€; + Dnén+1| + | pris i 


is 


(1.12) Gg. = > Om, kth me Pnsitn + Pnso ent } + | Drsi ky } ’ 


n=l 
= = n— n y 1| 
(1.13) Hn = m,r—iét | = Poms Be | | Pek aiti. 
) Xu > , us | E | 2 IP Pn Pn+i Pm+1 Po 
20 20 . P 
1.14 Ht = = -~ | P,| Pn+i _ Pn+2 r Pm m—1 
( n=l 2d ma bh = ip. | I Pn Pn+1 Pm- yh 


Since a,x in this case is 0 for k > m, we see from (1.11) and (1.12) that for each 
fixed-m = 1, 2, 3, --- 


Gt, = Ge.ntt (r =m+2,m+3,m+4,---), 
Gh, = 7 (r = m, m + 1, m + 2,---), 


from which it follows that $A_R$ and $A_L$ satisfy the condition (1.08). 

With the remaining condition, the state of affairs is complicated by the fact that regularity alone is not sufficient to ensure the satisfaction of (1.09). Consider the following examples. 








Example 1. Let psn. = 1/n, Pant = 1/n!, por = 1/(n + 1)! for n = 1, 
2, 3,--:. The corresponding method (R, p,) is normal and regular, but from 


(1.13) we have 


2 
i;..; > Pan—1 Pan = (n+ 1) Pan >n+1. 


Dan Pin Pin-1 
Example 2. Let psn-2 = (n + 1)!, psn = n!, par = (n + 1)! for n = 1, 
2, 3,--:. The method (R, p;,) so defined is normal and regular, but (1.14) 


gives 


; an Pn Pan n+1 
Bt, > ee a p-(a- ) :. 
. Psn—1 P3n z ° Pp a 2 


since P;, > 2(n + 1)!. 

In view of the preceding examples, we see that Theorem 1 in the present instance reduces to the following form. 

THEOREM 2. In order that a normal and regular method of Riesz means be translative it is necessary and sufficient that the sequences $\{H_m^R\}$ and $\{H_m^L\}$ of (1.13) and (1.14) be bounded. 

This theorem improves the result of Garabedian and Randels to the extent previously mentioned. These writers (see footnote 5, p. 532) have also observed that for regular Riesz means with positive weights the monotonicity in either sense of the sequence $\{p_n/p_{n+1}\}$ is sufficient for translativity to the right. It is to be noticed that in the same circumstances this condition is likewise sufficient for translativity to the left. 

The method $(R, p_n)$ defined above in Example 1 may be used to illustrate the curious behavior of non-translative methods. For, let $s_1, s_2, s_3, \dots$ denote the sequence $0, 0!, 0, 0, 1!, 0, \dots, 0, (k - 1)!, 0, \dots$. One easily shows that this sequence is summable-$(R, p_n)$ to 1, whereas the sequence $s_0, s_1, s_2, \dots$ is summable-$(R, p_n)$ to 0, and the sequence $s_2, s_3, s_4, \dots$ is such that its $(R, p_n)$-transform diverges to $+\infty$. On the other hand, let $t_1, t_2, t_3, \dots$ denote the sequence $1, 0, 0, 1, 0, 0, \dots, 1, 0, 0, \dots$. This sequence is summable-$(R, p_n)$ to 1, while each of the sequences $t_0, t_1, t_2, \dots$ and $t_2, t_3, t_4, \dots$ is summable-$(R, p_n)$ to 0. 

Finally, we may point out that the notion of translativity enters naturally 
if we attempt to generalize the fact that the terms of a convergent series form a 
null sequence. If we ask to what extent this property is preserved when con- 
vergence is replaced by summability, we find an answer in the following theorem. 


THEOREM 3. In order that the summability of the series $\sum_{k=1}^{\infty} u_k$ by a linear regular method $A$ shall always imply the $A$-summability of the sequence $\{u_k\}$ to 0, it is necessary and sufficient that $A$ be translative to the left. 

The proof follows at once from the identity $(u_1, u_2, u_3, \dots) = (s_1, s_2, s_3, \dots) - (s_0, s_1, s_2, \dots) + (s_0, 0, 0, \dots)$ where $s_k = u_1 + u_2 + \cdots + u_k$ $(k \ge 1)$ and $s_0$ is arbitrary. 




2. Methods stronger than convergence. In order that a regular method of 
summability A should constitute a non-redundant generalization of ordinary 
convergence, it is necessary that at least one divergent sequence should be 
summable-A. If this condition is satisfied, we shall say that A is stronger than 
convergence. When faced with the problem of establishing this property for a 
given method A, one naturally attempts to construct a divergent sequence that 
is summable-A. The success of this process in practice is well known. On the 
other hand, it may be of interest to observe that it is possible to state a charac- 
terization of the reversible regular methods for which the preceding property 
holds. For, let us consider the reversible regular transformation 

(2.1)    $\sum_{k=1}^{\infty} a_{mk} s_k = t_m$    $(m = 1, 2, 3, \dots)$ 

and its inverse transformation 

(2.2)    $s_k = \sum_{m=0}^{\infty} \xi_k^m t_m$    $(k = 1, 2, 3, \dots)$ 

as found in §1. It is evident that $A$ will be stronger than convergence if and only if the transformation (2.2), regarded as a transformation of the space $(c)$, fails to be convergence-preserving. Now a transformation of the type (2.2) differs from one of the type (2.1) in that the former involves $t_0$, the limit of the sequence $\{t_m\}$. However, it is easy to see that the familiar necessary and sufficient conditions of Schur for preservation of convergence in (2.1) require but a minor modification in order to hold for (2.2). We find therefore that (2.2) will be convergence-preserving if and only if the following conditions are satisfied. 

(2.3)    $\sup_{k} \sum_{m=0}^{\infty} |\xi_k^m| < \infty$; 
(2.4)    $\lim_{k} \sum_{m=0}^{\infty} \xi_k^m$ exists; 
(2.5)    $\lim_{k} \xi_k^m$ exists    $(m = 1, 2, 3, \dots)$. 


From these remarks, we conclude the truth of the following theorem. 


THEOREM 4. In order that a reversible regular method be stronger than convergence, it is necessary and sufficient that at least one of the conditions (2.3), (2.4), (2.5) be violated. 

In applying this theorem to normal regular Riesz means (see §1) we find that (2.4) and (2.5) are always satisfied. Condition (2.3) must therefore fail to hold, and this reduces to the condition 

(2.6)    $\limsup_{k} \dfrac{|P_{k-1}| + |P_k|}{|p_k|} = +\infty$. 

If $p_k > 0$ for all $k$, condition (2.6) becomes simply $\limsup_{k} P_k/p_k = +\infty$. 
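Two familiar weight sequences illustrate the criterion. The sketch below assumes (2.6) reads $\limsup_k (|P_{k-1}| + |P_k|)/|p_k| = +\infty$ with $P_0 = 0$; the function name and the two sample weightings are illustrative. For $p_k = 1$ (arithmetic means) the ratio is $2k - 1$ and grows without bound, while for $p_k = 2^k$ it stays below 3, so on this reading the geometric weights give nothing beyond ordinary convergence.

```python
from fractions import Fraction


def ratio_26(p):
    """Return (|P_{k-1}| + |P_k|) / |p_k| for k = 1..len(p), with P_0 = 0."""
    out, P_prev = [], Fraction(0)
    for pk in p:
        P = P_prev + pk
        out.append((abs(P_prev) + abs(P)) / abs(pk))
        P_prev = P
    return out


if __name__ == "__main__":
    arithmetic = [Fraction(1)] * 50
    geometric = [Fraction(2) ** k for k in range(1, 41)]
    print(ratio_26(arithmetic)[-1])          # grows like 2k - 1
    print(float(max(ratio_26(geometric))))   # stays below 3
```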










3. Methods of type M. The method of summability $A$ defined by the matrix $(a_{mk})$ is said to be of¹² type $M$ if the conditions (see Banach, footnote 1) 

(3.1)    $\sum_{m=1}^{\infty} |\alpha_m| < \infty$,  $\sum_{m=1}^{\infty} \alpha_m a_{mk} = 0$    $(k = 1, 2, 3, \dots)$ 

always imply 

(3.2)    $\alpha_m = 0$    $(m = 1, 2, 3, \dots)$. 


The significance of this property is shown by the following theorems. 


THEOREM OF MAZUR.¹³ In order that a normal regular method $A$ be consistent with every regular method not weaker than $A$, it is necessary and sufficient that $A$ be of type $M$. 


THEOREM OF BANACH.¹⁴ In order that a reversible regular method $A$ be consistent with every regular method not weaker than $A$, it is sufficient that $A$ be of type $M$. 


Banach (see footnote 1, p. 236, lines 33-36) remarks without proof that the 
type M condition in the latter theorem is necessary as well as sufficient. Since 
no proof of this fact seems to have appeared in the literature the following one 
may be of interest. 

THEOREM 5. In order that a reversible regular method $A$ be consistent with every regular method not weaker than $A$, it is necessary and sufficient that $A$ be of type $M$. 

Proof of necessity. We assume that $A$ is a reversible regular method consistent with every regular method not weaker than $A$ and that $\{\alpha_m\}$ is an arbitrary sequence satisfying the conditions (3.1). We have to show that conditions (3.2) necessarily follow. With this in view, let $b_{nk} = \sum_{m=1}^{n} \alpha_m a_{mk}$ $(k, n = 1, 2, 3, \dots)$ and let $B$ denote the method defined by the matrix $(b_{nk})$. If $\{s_k\}$ is an arbitrary $A$-summable sequence, we set $\sum_{k=1}^{\infty} a_{mk} s_k = t_m$ $(m = 1, 2, 3, \dots)$, so that the sequence $\{t_m\}$ is convergent and therefore bounded. It follows easily then that 

(3.3)    $\lim_{n} \sum_{k=1}^{\infty} b_{nk} s_k = \lim_{n} \sum_{m=1}^{n} \alpha_m t_m = \sum_{m=1}^{\infty} \alpha_m t_m$. 

12 See J. D. Hill, On perfect methods of summability, Duke Mathematical Journal, vol. 
3(1937), pp. 702-714. This paper is devoted to methods of type M that are regular and 
reversible; such methods are called perfect. 

13 S. Mazur, Eine Anwendung der Theorie der Operationen bei der Untersuchung der Toeplitzschen Limitierungsverfahren, Studia Mathematica, vol. 2(1930), p. 48, Satz 7. 

14 See footnote 1, p. 95, Théorème 12. The proof is given for the real domain but it may be easily extended to the complex. 










This relation shows that every $A$-summable sequence is summable-$B$ to the value $\sum_{m=1}^{\infty} \alpha_m t_m$. Furthermore, if $\{s_k\}$ is convergent, we make use of (3.1) and the fact that the double series $\sum_{k,m=1}^{\infty} \alpha_m a_{mk} s_k$ converges absolutely to show that 

$\sum_{m=1}^{\infty} \alpha_m t_m = \sum_{m=1}^{\infty} \sum_{k=1}^{\infty} \alpha_m a_{mk} s_k = \sum_{k=1}^{\infty} \left( \sum_{m=1}^{\infty} \alpha_m a_{mk} \right) s_k = 0$. 

Consequently, every convergent sequence is summable-$B$ to 0. 

It is now apparent that the method $C$ defined by the matrix $(a_{mk} + b_{mk})$ is regular and not weaker than $A$. Moreover, since $A$ is reversible, we infer the existence of a sequence $\{\sigma_k\}$ satisfying the equations 

$\sum_{k=1}^{\infty} a_{mk} \sigma_k = \bar{\alpha}_m$    $(m = 1, 2, 3, \dots)$, 

where $\bar{\alpha}_m$ is the complex conjugate of $\alpha_m$. Since $\lim_{m} \alpha_m = 0$, the sequence $\{\sigma_k\}$ is summable-$A$ to 0; and in view of (3.3), it is summable-$C$ to the value $\sum_{m=1}^{\infty} |\alpha_m|^2$. If $A$ and $C$ are consistent, the latter expression must vanish, and this completes the proof. 


MICHIGAN STATE COLLEGE. 








A GENERAL EQUATION FOR RELAXATION OSCILLATIONS 
By NORMAN LEVINSON AND OLIVER K. SMITH 


1. Introduction. The importance of relaxation oscillations in physical and 
engineering problems was shown by Van der Pol,’ who also treated by graphical 
methods a particular equation describing these oscillations. The origin of relaxa- 
tion oscillations can be described qualitatively by considering the following two 
equations with constant coefficients: 


$$\ddot{x} + 2a\dot{x} + bx = 0, \tag{1.1}$$

$$\ddot{x} - 2a\dot{x} + bx = 0, \tag{1.2}$$

where $a > 0$, $b > a^2$, and the differentiations are with respect to $t$. If we denote $(b - a^2)^{1/2}$ by $\omega$ then the solution of (1.1) is $Ae^{-at} \sin(\omega t + \alpha)$ and the solution of (1.2) is $Ae^{at} \sin(\omega t + \alpha)$, where $A$ and $\alpha$ are arbitrary constants. In an electrical circuit, for example, described by (1.1), the term $2a\dot{x}$ arises from a withdrawal of energy from the system. Since there is no energy being put into the system, this withdrawal is uncompensated and results in a gradual dissipation of the initial energy of the system. Thus the solution of (1.1) tends to zero. In (1.2), on the other hand, the only term affecting the energy is $-2a\dot{x}$, which arises from adding energy to the system. Thus the solution of (1.2) describes oscillations of ever increasing amplitude.
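
The closed-form solutions quoted for (1.1) and (1.2) can be checked numerically. The following Python sketch is an illustration only: the function names and the sample values $a = 0.5$, $b = 2$, $A = 1$, $\alpha = 0.3$ are assumptions chosen for the example and do not come from the paper. It evaluates the solution together with its first two derivatives and confirms that the left side of each equation vanishes.

```python
import math

def solution_and_derivatives(t, a, b, sign, A=1.0, alpha=0.3):
    """Return x, x', x'' for x(t) = A*exp(sign*a*t)*sin(w*t + alpha), w = sqrt(b - a^2).

    sign = -1 gives the decaying solution of (1.1);
    sign = +1 gives the growing solution of (1.2).
    """
    w = math.sqrt(b - a * a)          # requires b > a^2
    k = sign * a
    e = A * math.exp(k * t)
    s, c = math.sin(w * t + alpha), math.cos(w * t + alpha)
    x = e * s
    dx = e * (k * s + w * c)
    d2x = e * ((k * k - w * w) * s + 2.0 * k * w * c)
    return x, dx, d2x

def residual(t, a, b, sign):
    """Left side x'' - sign*2a*x' + b*x of (1.1) (sign=-1) or (1.2) (sign=+1)."""
    x, dx, d2x = solution_and_derivatives(t, a, b, sign)
    return d2x - sign * 2.0 * a * dx + b * x

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5, 5.0):
        print(t, residual(t, 0.5, 2.0, -1.0), residual(t, 0.5, 2.0, +1.0))
```

Both residuals vanish up to rounding error, while the factor $e^{\mp at}$ makes the first solution decay and the second grow, in agreement with the energy argument above.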

The equation (1.1) may be said to describe a system with positive damping 
and (1.2), a system with negative damping. Positive damping decreases the 
energy and therefore the amplitude of an oscillation while negative damping increases it. Relaxation oscillations arise when both positive and negative damping occur in a system. More precisely, what occurs is that for small displacements, that is, when $x$ is small, the system has negative damping which causes
oscillations of increasing amplitude; on the other hand, for large displacements 
the system has positive damping which tends to decrease the amplitude of 
oscillation. Clearly, then, the steady state amplitude of oscillation 
of the system cannot be too small, for then damping would always be 
negative and the oscillation would increase in amplitude, that is, would not be 
steady-state. In the same way a very large amplitude is not possible. Thus 
qualitatively one would expect a steady-state oscillation of such amplitude that 
during each period the energy lost when the displacement was large (and damp- 
ing positive) would be exactly compensated by the energy gained when the 
displacement was small (and damping negative). 

All this is what one would expect qualitatively. Actually, under a wide range 
of conditions precisely this does happen. On the other hand, there are also 


Received December 10, 1941. 
1 B. Van der Pol, Relaxation-Oscillations, Philosophical Magazine, seventh series, vol. 2 (1926), p. 978.


equations satisfying our qualitative description which do not have any steady- 
state oscillating solution at all. In any case a mere qualitative discussion is 
entirely inconclusive. Von Kármán² has, in fact, stated that the question of
existence of solutions for non-linear problems is of considerable practical interest 
since the existence of such solutions is by no means obvious. 

The equation considered in detail by Van der Pol was

$$\ddot{x} - \mu(1 - x^2)\dot{x} + x = 0.$$

With $\mu > 0$, it is clear that for $|x| < 1$ the damping is negative and for $|x| > 1$, positive. From graphical considerations Van der Pol obtained the solutions for various values of $\mu$ and in each case the result was a rapid approach to a steady-state oscillation.
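
The approach to a steady-state oscillation can be observed numerically. The sketch below is an illustration, not the graphical method of Van der Pol: it rewrites the equation as the first-order pair $dx/dt = v$, $dv/dt = \mu(1 - x^2)v - x$ and integrates it with a classical fourth-order Runge-Kutta step. The step size, the integration time, the value $\mu = 1$ and the two starting points are assumptions made for the example.

```python
def van_der_pol(x, v, mu):
    """Right side of dx/dt = v, dv/dt = mu*(1 - x^2)*v - x."""
    return v, mu * (1.0 - x * x) * v - x

def rk4_step(x, v, dt, mu):
    """One classical Runge-Kutta step of size dt."""
    k1x, k1v = van_der_pol(x, v, mu)
    k2x, k2v = van_der_pol(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, mu)
    k3x, k3v = van_der_pol(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, mu)
    k4x, k4v = van_der_pol(x + dt * k3x, v + dt * k3v, mu)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    v_new = v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x_new, v_new

def steady_amplitude(x0, v0, mu=1.0, dt=0.01, t_total=80.0, t_window=10.0):
    """Largest |x| seen during the final t_window time units of the run."""
    steps = int(round(t_total / dt))
    window = int(round(t_window / dt))
    x, v = x0, v0
    amplitude = 0.0
    for i in range(steps):
        x, v = rk4_step(x, v, dt, mu)
        if i >= steps - window:
            amplitude = max(amplitude, abs(x))
    return amplitude

if __name__ == "__main__":
    print("start inside :", steady_amplitude(0.1, 0.0))
    print("start outside:", steady_amplitude(4.0, 0.0))
```

A start near the origin grows and a start at large displacement shrinks, and both runs settle to the same amplitude, close to 2 for $\mu = 1$.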

A more general description of relaxation oscillations is given by the equation

$$\ddot{x} + f(x)\dot{x} + x = 0, \tag{1.3}$$

where $f(x)$ is negative for small values of $|x|$ and positive for large values of $|x|$. A case of particular importance is where $f(x)$ is an even function of $x$. For this case the first satisfactory treatment of (1.3) is due to Liénard.³ Liénard introduces

$$F(x) = \int_0^x f(x)\,dx.$$

Since $f(x)$ is even, $F(x)$ is an odd function. Liénard shows that if there exists some value of $x$, $x_0$, such that, for $0 < x < x_0$, $F(x)$ is negative while, for $x > x_0$, $F(x)$ is positive and steadily increasing with $F(x) \to \infty$ as $x \to \infty$, then all solutions of (1.3) tend toward a steady oscillatory solution which is unique except for its phase. (This latter is to be expected since (1.3) is unchanged by a translation of the time scale, and thus translations of a solution are also solutions.) Liénard's work represents the furthest progress in this problem to date.

Here we propose to consider the generalized equation for relaxation oscillations

$$\ddot{x} + f(x, \dot{x})\dot{x} + g(x) = 0. \tag{1.4}$$

$g(x)$ is positive when $x > 0$ and negative when $x < 0$. $f(x, \dot{x})$ is the damping coefficient which for large $|x|$ is positive and for small $|\dot{x}|$ and $|x|$ is negative. With little more than these requirements we shall show that (1.4) possesses periodic solutions and is therefore a generalized equation for relaxation oscillations. The question of the existence of a unique periodic solution is more


involved and can be settled only with further restrictions. This problem we 


2 T. Von Kármán, The Engineer Grapples with Nonlinear Problems, Bulletin of the American Mathematical Society, vol. 46 (1940), p. 617.

3 A. Liénard, Étude des oscillations entretenues, Revue Gén. de l'Électricité, t. XXIII (1928), pp. 901-946.







shall also consider. As a particular case of this consideration the existence of a 
unique solution for 


$$\ddot{x} + f(x)\dot{x} + g(x) = 0 \tag{1.5}$$

will be demonstrated when $f(x)$ is not necessarily even nor $g(x)$ necessarily odd.

Using the method of Liénard we shall also demonstrate the existence of a unique steady-state oscillating solution for (1.5) under precisely the same conditions on $f(x)$ as were given above in the statement of Liénard's result for (1.3).


2. On the existence of periodic solutions. Here we shall show that the equation

$$\ddot{x} + f(x, \dot{x})\dot{x} + g(x) = 0 \tag{2.0}$$

possesses periodic solutions and thereby justify our calling it a generalized equation for relaxation oscillations. The equation (2.0) can be written as a pair of first order equations,

$$\frac{dx}{dt} = v, \qquad \frac{dv}{dt} = -f(x, v)v - g(x). \tag{2.1}$$

We shall assume throughout that the derivative of $g(x)$ and the first order partial derivatives with respect to $x$ and $v$ of $f(x, v)$ exist and are continuous. Since $dt$ can be eliminated from the pair of equations (2.1) resulting in a first order equation in $x$ and $v$, it follows that to solutions of (2.0) correspond curves in the $(x, v)$-plane which are solutions of the first order equation in $x$ and $v$. In particular, to a periodic solution of (2.0) there must correspond a closed curve in the $(x, v)$-plane since $x$ and $v$ return to their initial values in the course of the period of such a solution. Conversely, any solution of (2.1), which, when considered as a curve in the $(x, v)$-plane with parameter $t$, is a closed curve traversed with a finite change in $t$, corresponds to a periodic solution of (2.0). Thus the problem of finding periodic solutions of (2.0) is reduced to the problem of showing that there are solutions of (2.1) forming closed curves in the $(x, v)$-plane, which are traversed with a finite change in the parameter $t$.

We now state our theorem in precise terms. 


THEOREM I. Let $xg(x) > 0$ for $|x| > 0$. Moreover let

$$\int_0^{\pm\infty} g(x)\,dx = \infty. \tag{2.2}$$

Let $f(0, 0) < 0$ and let there exist some $x_0 > 0$ such that $f(x, v) \ge 0$ for $|x| \ge x_0$. Further, let there exist an $M$ such that for $|x| \le x_0$

$$f(x, v) \ge -M. \tag{2.3}$$

Finally, let there exist some $x_1 > x_0$ such that⁴


4 In (2.4) it would of course be equally good to have the integration from $(-x_1, -x_0)$ and then to take $v$ as negative.










$$\int_{x_0}^{x_1} f(x, v)\,dx \ge 10Mx_0, \tag{2.4}$$

where $v > 0$ is an arbitrary decreasing positive function of $x$ in the integration in (2.4). Under these conditions (2.0) has at least one periodic solution.


In the proof of this theorem we make considerable use of the function

$$G(x) = \int_0^x g(x)\,dx.$$

Since $xg(x) > 0$, it follows that $G(x) > 0$ for $x \ne 0$ and that $G(x)$ decreases monotonically as $x$ increases from $-\infty$ to 0 and increases monotonically as $x$ increases from 0 to $\infty$. From (2.2) it follows further that $\lim G(x)$ $(|x| \to \infty)$ is $\infty$. This last result can be replaced by a weaker requirement which states merely that $G(x)$ gets large for large $x$. Thus instead of (2.2), we first require that, for some $x_1 > x_0$,

$$G(x_1) - G(x_0) \ge \max\left\{ 400M^2x_0^2,\ \frac{G^2(x_0)}{M^2x_0^2} \right\}. \tag{2.5}$$

Since $G(x)$ is an increasing function for positive $x$, it follows that if (2.5) is satisfied it continues to be satisfied if we increase $x_1$. Thus $x_1$ in (2.5) can be chosen so as to be at least as big as $x_1$ in (2.4). Having so chosen $x_1$, the only requirement beyond (2.5) needed to replace (2.2) is that

$$\lim_{|x| \to \infty} G(x) \ge 2G(x_1). \tag{2.6}$$

Since (2.4) continues to be satisfied if $x_1$ is increased by virtue of the fact that $f(x, v) \ge 0$ for $x > x_0$, it follows finally that we can, if necessary, increase $x_1$ in (2.4) so that $x_1$ in (2.4), (2.5) and (2.6) are all equal.

Before proceeding to the proof of Theorem I we shall consider the general pair of equations

$$\frac{dx}{dt} = X(x, y), \qquad \frac{dy}{dt} = Y(x, y), \tag{2.7}$$

where $X$ and $Y$ possess continuous first order partial derivatives in $x$ and $y$. The equations (2.7) include as a particular case the equations (2.1) and therefore are of interest to us here. The equations (2.7) have been treated in considerable detail and we shall merely state some of their properties. Those points $(x, y)$ at which both $X$ and $Y$ vanish are known as singular points. Through every point in the $(x, y)$-plane with the possible exception of singular points there passes one and only one solution of (2.7), this solution being given in terms of the parameter $t$. Such a solution forms a part of an integral curve. An integral curve cannot cross itself or another integral curve except at a singular point. Moreover, the change in $t$ in going between two points on an integral curve is always finite unless either of these points or some intermediate point on the curve is a singular point. Conversely, if an integral curve runs into a singular point the change in $t$ as the point is approached along the curve tends to infinity.







Now let us consider an integral curve which for $t \to \infty$ remains in a finite region $R$ in the $(x, y)$-plane and let $R$ be free of singular points. Then, as $t \to \infty$, the length of the curve must tend to infinity for if the curve is finite it terminates in a point as $t \to \infty$, which is impossible in a region free of singular points.

We have then an infinite curve in a finite region which never intersects itself. Sketching a few such curves will make very plausible indeed the following theorem.

THEOREM A.⁵ If an integral curve of (2.7) lies in a finite region $R$ for $t \to \infty$ and if there are no singular points in $R$, then the integral curve is either a closed curve or else it approaches nearer and nearer to a closed integral curve.

Thus returning to (2.1) we have a general method⁶ for demonstrating the existence of closed integral curves in the $(x, v)$-plane or, in other words, periodic solutions of (2.0). We now proceed to demonstrate the existence of a region in the $(x, v)$-plane satisfying the requirements of Theorem A, thereby proving Theorem I with the less restrictive (2.5) and (2.6) in place of (2.2).

Proof of Theorem I. In the equations (2.1) the only singular point in the $(x, v)$-plane is $(0, 0)$. For, clearly, $dx/dt = v$ vanishes only if $v = 0$. Once $v = 0$, the other equation becomes $dv/dt = -g(x)$. But $g(x) = 0$ only at $x = 0$. Thus $(0, 0)$ is the only singular point.

We now introduce 


$$\lambda(x, v) = \tfrac{1}{2}v^2 + G(x). \tag{2.8}$$

Since $G'(x) = g(x)$, it follows that the curves $\lambda(x, v) = c$ must have negative slope when $x$ and $v$ are both positive or both negative, and have positive slope otherwise. Clearly

$$\frac{d\lambda}{dt} = v\frac{dv}{dt} + g(x)\frac{dx}{dt} = v\left( \frac{dv}{dt} + g(x) \right).$$

Or, using the equation in (2.1) for $dv/dt$, this becomes

$$\frac{d\lambda}{dt} = -v^2 f(x, v). \tag{2.9}$$


Thus if $f(x, v) > 0$, then as $t$ increases the integral curves of (2.1) in the $(x, v)$-plane cut across the curves $\lambda(x, v) = c$ so that $\lambda$ decreases while if $f(x, v) < 0$ the integral curves cut the $\lambda(x, v) = c$ curves so that $\lambda$ increases. In particular, since $f(0, 0) < 0$, around the origin the integral curves cut outward across $\lambda(x, v) = c$. Thus the integral curves move outward from the origin (which is the only singular point) as $t$ increases. (See $AB$ in Figure 1.) On the other hand, since for $|x| \ge x_0$, $f(x, v) \ge 0$, it follows that integral curves for which $|x| > x_0$ cut inward across the curves $\lambda(x, v) = c$ as in Figure 1. (The direction of the curves sketched, $CD$ moving to the right and $EF$ to the left, comes from the equation $dx/dt = v$. When $v > 0$, this equation states that $x$ increases as $t$ increases whereas, when $v < 0$, $x$ decreases as $t$ increases.)


5 All the results stated above as well as Theorem A are proved in Ivar Bendixson, Sur les courbes définies par les équations différentielles, Acta Mathematica, vol. 24 (1901), pp. 1-88.

6 This method has been used by one of the authors, O. K. Smith, in his thesis at M. I. T. (May, 1941) to prove Theorem I in a more restrictive form and with $g(x) = x$. The method has also been used by V. S. Ivanov in a paper which is reviewed in Math. Reviews, vol. 2 (1941), p. 287. According to the review, however, Ivanov merely applies the method to the equation dealt with by Liénard.











Fig. 1

We now introduce

$$v_0 = 2\,(G(x_1) - G(x_0))^{1/2} \tag{2.10}$$

and consider the curve

$$\lambda(x, v) = \tfrac{1}{2}v_0^2 + G(x_0) = \lambda_0. \tag{2.11}$$

Clearly,

$$\lambda_0 = 2G(x_1) - G(x_0) < 2G(x_1). \tag{2.12}$$


The curve $\lambda(x, v) = \lambda_0$ will be closed if it intersects the positive and negative $x$-axis. And this latter depends on the possibility of solving $\lambda(x, 0) = \lambda_0$, or, since $\lambda(x, 0) = G(x)$, on solving $G(x) = \lambda_0$ for positive and negative values of $x$. But $\lambda_0 < 2G(x_1)$ and, by (2.6), $G(x)$ for sufficiently large $|x|$ exceeds $2G(x_1)$. Since $G(x)$ increases with $|x|$ and $G(0) = 0$, it follows that $\lambda(x, 0) = \lambda_0$ has a solution for positive and for negative $x$. Thus the curve $\lambda(x, v) = \lambda_0$ is closed.
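
To make the constants concrete, the following Python sketch evaluates them for the Van der Pol equation, taken here as an illustrative special case with $f(x, v) = \mu(x^2 - 1)$ and $g(x) = x$, so that $G(x) = x^2/2$, $x_0 = 1$ and $M = \mu$. These identifications, the function name and the bisection bracket are assumptions made for the example and are not worked out in the text. Because this $f$ does not depend on $v$, the integral in (2.4) can be evaluated in closed form.

```python
import math

def theorem1_constants(mu):
    """x1, v0 and lambda0 of Theorem I for f = mu*(x^2 - 1), g = x, with mu > 0."""
    x0 = 1.0                       # f >= 0 for |x| >= 1
    M = mu                         # f >= -mu everywhere, so (2.3) holds with M = mu

    def G(x):                      # G(x) = integral of g from 0 to x
        return 0.5 * x * x

    # (2.4): the integral of f from x0 to x1 is mu*(x1^3/3 - x1 + 2/3).
    def excess(x1):
        return mu * (x1 ** 3 / 3.0 - x1 + 2.0 / 3.0) - 10.0 * M * x0

    lo, hi = x0, 10.0              # excess(lo) < 0 < excess(hi) for every mu > 0
    for _ in range(60):            # bisection for the smallest admissible x1
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x1_from_24 = hi

    # (2.5): G(x1) - G(x0) >= max{400 M^2 x0^2, G(x0)^2 / (M^2 x0^2)}
    bound = max(400.0 * M * M * x0 * x0, G(x0) ** 2 / (M * M * x0 * x0))
    x1_from_25 = math.sqrt(2.0 * (bound + G(x0)))

    x1 = max(x1_from_24, x1_from_25)       # (2.6) is automatic: G is unbounded
    v0 = 2.0 * math.sqrt(G(x1) - G(x0))    # (2.10)
    lam0 = 2.0 * G(x1) - G(x0)             # (2.12)
    return x1_from_24, x1_from_25, x1, v0, lam0

if __name__ == "__main__":
    print(theorem1_constants(1.0))
```

For $\mu = 1$ condition (2.5) is the binding one and gives $x_1 = \sqrt{801}$, $v_0 = 40$ and $\lambda_0 = 800.5$. The closed curve $\lambda(x, v) = \lambda_0$ is therefore far larger than the oscillation it encloses; the constants are chosen to make the estimates of the proof work, not to locate the periodic solution sharply.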










This curve is $abcd$ of Figure 2. We next consider the solution of (2.1) which starts at the point $(x_0, v_0)$, point $A$ in the figure. We have seen that for $|x| \ge x_0$, $d\lambda/dt \le 0$ along an integral curve and thus the integral curve we are considering cuts inward across the curves $\lambda(x, v) = c$ so long as $x \ge x_0$. Let us suppose the integral curve through $A$ intersects the line $x = x_1$ for the first time at $B$. We denote the value of $\lambda(x, v)$ at $B$ by $\lambda_1$. From (2.9) and $dx = v\,dt$ it follows that

$$\frac{d\lambda}{dx} = -f(x, v)v. \tag{2.13}$$

Fig. 2











Integrating (2.13) along the integral curve from $A$ to $B$, it follows that

$$\lambda_1 - \lambda_0 = -\int_{x_0}^{x_1} f(x, v)v\,dx. \tag{2.14}$$

Now either the $v$ coordinate at $B$ is greater than (or equal to) $\tfrac{1}{2}v_0$ or it is less than $\tfrac{1}{2}v_0$. Let us suppose the former is the case. Then, from (2.14),

$$\lambda_1 - \lambda_0 \le -\tfrac{1}{2}v_0 \int_{x_0}^{x_1} f(x, v)\,dx$$

and, using (2.4),

$$\lambda_1 - \lambda_0 \le -\tfrac{1}{2}v_0(10Mx_0).$$

Or

$$\lambda_1 \le \lambda_0 - 5Mx_0v_0. \tag{2.15}$$










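On the other hand, if $v$ at $B$ is less than $\tfrac{1}{2}v_0$, then

$$\lambda_1 \le \tfrac{1}{2}\left( \frac{v_0}{2} \right)^2 + G(x_1).$$

Or, if we use the value of $\lambda_0$ in (2.12),

$$\begin{aligned}
\lambda_1 &\le \tfrac{1}{2}[G(x_1) - G(x_0)] + G(x_1) \\
&= \tfrac{3}{2}G(x_1) - \tfrac{1}{2}G(x_0) \\
&= \tfrac{3}{4}\lambda_0 + \tfrac{3}{4}G(x_0) - \tfrac{1}{2}G(x_0) \\
&= \tfrac{3}{4}\lambda_0 + \tfrac{1}{4}G(x_0).
\end{aligned}$$

Thus

$$\begin{aligned}
\lambda_1 &\le \lambda_0 - \tfrac{1}{4}[\lambda_0 - G(x_0)] \\
&= \lambda_0 - \tfrac{1}{8}v_0^2 \\
&= \lambda_0 - \frac{v_0}{4}(G(x_1) - G(x_0))^{1/2}.
\end{aligned} \tag{2.16}$$

Using (2.5), this becomes

$$\lambda_1 \le \lambda_0 - \frac{v_0}{4}(20Mx_0) = \lambda_0 - 5Mx_0v_0.$$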


Thus, in any case, (2.15) holds if the integral curve intersects the line $x = x_1$. Since $\lambda$ continues to decrease along the integral curve for $x > x_0$, it follows that if $\lambda$ at $C$ (in Figure 2) on the integral curve be denoted by $\lambda_2$ then $\lambda_2 \le \lambda_1$ and, therefore,

$$\lambda_2 \le \lambda_0 - 5Mx_0v_0. \tag{2.17}$$

In case the integral curve does not intersect the line $x = x_1$, it follows that it must cut the positive $x$-axis between $x_0$ and $x_1$ at, let us say, $x_2$. At this point $(x_2, 0)$, $\lambda = G(x_2)$. But since $G(x)$ grows with $|x|$, $G(x_2) < G(x_1)$ and, therefore, at $(x_2, 0)$, $\lambda < G(x_1)$. Since $\lambda$ decreases as the integral curve proceeds to $C$, it follows that $\lambda$ at $C$, $\lambda_2$, is less than $\lambda$ at $(x_2, 0)$. Thus, in this case,

$$\lambda_2 < G(x_1) = \tfrac{1}{2}\lambda_0 + \tfrac{1}{2}G(x_0) = \lambda_0 - \tfrac{1}{2}[\lambda_0 - G(x_0)].$$

Proceeding as in (2.16),

$$\lambda_2 \le \lambda_0 - 10Mx_0v_0 < \lambda_0 - 5Mx_0v_0.$$


Thus in any case (2.17) holds. 










We now assume that the integral curve cuts the line $x = -x_0$ at $D$. Denoting $\lambda$ at $D$ by $\lambda_3$ and integrating (2.13) from $C$ to $D$,

$$\lambda_3 - \lambda_2 = -\int_{x_0}^{-x_0} v f(x, v)\,dx$$

or

$$\lambda_3 - \lambda_2 = -\int_{-x_0}^{x_0} |v|\, f(x, v)\,dx \le M \int_{-x_0}^{x_0} |v|\,dx.$$

Now if, along $CD$, $|v| < v_0$, we have

$$\lambda_3 - \lambda_2 < 2Mx_0v_0. \tag{2.18}$$

Otherwise, if $|v|$ exceeds $v_0$ along $CD$, we denote by $P$ the point where $|v| = v_0$ for the first time between $C$ and $D$ and denote the coordinates of $P$ by $(x_4, -v_0)$ and $\lambda$ at $P$ by $\lambda_4$. Then, as above, integrating (2.13) from $C$ to $P$,

$$\lambda_4 - \lambda_2 \le M \int_{x_4}^{x_0} |v|\,dx < 2Mx_0v_0.$$

Or, since $\lambda = \tfrac{1}{2}v^2 + G(x)$, writing $v_2$ for the value of $v$ at $C$,

$$\tfrac{1}{2}v_0^2 - \tfrac{1}{2}v_2^2 \le 2Mx_0v_0 + G(x_0) - G(x_4) \le 2Mx_0v_0 + G(x_0).$$

On the other hand, by (2.17), $\lambda_0 - \lambda_2 \ge 5Mx_0v_0$, or

$$\tfrac{1}{2}v_0^2 - \tfrac{1}{2}v_2^2 \ge 5Mx_0v_0.$$

Thus $2Mx_0v_0 + G(x_0) \ge 5Mx_0v_0$. Or

$$G(x_0) \ge 3Mx_0v_0 = 6Mx_0(G(x_1) - G(x_0))^{1/2}.$$

But since $G(x_1) - G(x_0) \ge G^2(x_0)/M^2x_0^2$, this gives

$$G(x_0) \ge 6G(x_0),$$

which is impossible. Thus $|v| < v_0$ and therefore (2.18) holds.

Combining (2.18) with (2.17), we have

$$\lambda_3 < \lambda_0 - 3Mx_0v_0.$$

Since from $D$ to $E$, $d\lambda/dt \le 0$ along the integral curve, it follows that $\lambda$ at $E$, which we shall denote by $\lambda_5$, is at most $\lambda_3$. Thus

$$\lambda_5 < \lambda_0 - 3Mx_0v_0.$$

If we now proceed from $E$ to $F$ in much the same way as from $C$ to $D$, it follows that $\lambda$ at $F$ is less than $\lambda_0$ which is the value of $\lambda$ at $A$. Since $\lambda = \tfrac{1}{2}v^2 + G(x)$,










and $x = x_0$ at both $A$ and $F$, it follows, therefore, that $v$ at $F$ is less than $v$ at $A$. In other words, $F$ lies below $A$ as shown in Figure 2.

The above conclusion is valid if the integral curve intersects the line $x = -x_0$. If this does not occur then the integral curve cuts the $x$-axis at some point $D'$ between 0 and $-x_0$. If the path from $C$ to $D'$ is treated in much the same way as that from $C$ to $D$ and that from $D'$ to $F$ in much the same way as from $E$ to $F$, our conclusions will still follow.

We now denote by $R$ the region which is bounded on the outside by $ABCDEFA$, and on the inside by the curve $\lambda(x, v) = \delta$, where $\delta$ is chosen so small that in the interior of $\lambda(x, v) = \delta$, $f(x, v) < 0$. Then no integral curve which starts in $R$ will ever leave $R$ as $t$ increases. For in the first place such a curve will not cut across $ABCDEF$ since the latter is an integral curve and no two such curves intersect in a region free of singular points. In the second place, because $dx = v\,dt$, $x$ increases with $t$ in the upper half plane $v > 0$. Thus any integral curve in $R$ and near the line segment $AF$ will move away from $AF$ to the right and therefore never intersect it. Hence the outer boundary of $R$ will not be cut by any integral curve starting in $R$. Again, since along the inner boundary $f(x, v) < 0$, it follows from

$$\frac{d\lambda}{dt} = -f(x, v)v^2$$

that $\lambda$ increases with $t$. Thus, integral curves in $R$ near the inner boundary move outward from the inner boundary $\lambda(x, v) = \delta$ as $t$ increases and, therefore, never intersect the inner boundary. Therefore the boundary of $R$ is intersected by no integral curve which starts in $R$. By Theorem A of Bendixson, already referred to, this means that there is at least one closed integral curve which lies in $R$. This proves Theorem I.⁷


3. Conditions for a unique solution. Here we shall consider a condition which assures that the generalized equation for relaxation oscillations, (1.4), has, except for translations in $t$, only one periodic solution. This is equivalent to proving that there is at most one closed integral curve in the $(x, v)$-plane. From Theorem I we know of course that there must be at least one closed curve.

The method used here in proving that under certain conditions there is only one closed integral curve depends on the fact that two adjoining closed integral curves of (2.1) cannot both be stable. (A closed integral curve is stable if any integral curve starting sufficiently close to it spirals nearer and nearer to the closed integral curve as $t \to \infty$.)

To prove that two adjoining closed integral curves cannot both be stable we consider two stable closed integral curves $I_1$ and $I_2$ of (2.1). Let $I_2$ be interior to $I_1$. $R$ is the region bounded outside by $I_1$ and inside by $I_2$. Obviously $R$


7 Because of the form of the equations (2.1) it is also very easy to show how, directly 
without reference to Bendixson’s theorem, a consequence of F lying below A in Figure 2 is 
that there must be at least one closed integral curve. 










is free of singular points. We shall show that in the interior of $R$ there must lie at least one closed integral curve $I_3$.

For, let us suppose that there is no closed integral curve in the interior of $R$. Let $A$, in Figure 3, be the point in which $I_1$ cuts the $x$-axis for $x > 0$ and $B$ the point where $I_2$ cuts the $x$-axis for $x > 0$. Then every integral curve which starts on the line $AB$ must, as $t \to \infty$, move closer and closer to $I_1$ or to $I_2$. This follows from the theorem of Bendixson used in §2, Theorem A, and from the fact that we have assumed that there are no closed curves in the interior of $R$. Consider a point $A_1$ on $AB$ close to $A$. Then the integral curve starting at $A_1$ must, because of the stability of $I_1$, be asymptotic to $I_1$ and, therefore, as $t$ increases, it must cut the positive $x$-axis at a sequence of points which moves steadily from $A_1$ to $A$.











Fig. 3


Similarly, the integral curve starting from a point $B_1$ sufficiently near $B$ must approach closer and closer to $I_2$. Moreover, if the integral curve starting at a point $P$ on $AB$ is asymptotic to $I_2$, then, since no two integral curves intersect, it follows that an integral curve starting from any point on $AB$ to the left of $P$ must also be asymptotic to $I_2$. The corresponding result for $I_1$ is equally true. Thus we have determined a cut in $AB$. This cut determines a point which we shall denote by $C$. The integral curve which starts at $C$ and makes one turn about $O$ must again intersect $AB$ at $C$, for suppose it intersects to the right of $C$. Then by the continuity of integral curves with respect to changes in initial conditions, integral curves starting to the left of $C$, but very close to $C$, would also intersect the $x$-axis to the right of $C$ after one turn and thus be asymptotic to $I_1$. But integral curves starting to the left of $C$ must be asymptotic to $I_2$. Thus the integral curve starting at $C$ cannot intersect $AB$ to the right of $C$. In the










same way it cannot intersect $AB$ to the left of $C$. Thus the integral curve at $C$ is a closed curve $I_3$. This contradicts our assumption that $I_1$ and $I_2$ are two adjoining closed integral curves. Thus we have proved that two adjoining closed integral curves of (2.1) cannot both be stable.

We shall next show that under certain conditions every closed integral curve 
of (2.1) is stable. But this fact and the fact just demonstrated that there 
cannot be adjoining closed integral curves, both stable, means that there is at 
most one closed integral curve under these conditions. Thus theorems on the 
existence of a unique closed integral curve are reduced to theorems on stability. 

Let us now consider the general equation for relaxation oscillations, (2.1), where $f(x, v)$ and $g(x)$ are subject to the conditions of Theorem I. As in §2 we consider the curves $\lambda(x, v) = c$. Further, we denote by $R_1$ the region in the $(x, v)$-plane where $f(x, v)$ is negative and by $R_2$ the region where $f(x, v)$ is positive. We shall denote that part of the curve $\lambda(x, v) = c$ which lies in $R_1$ by $R_1(c)$ and that part of $\lambda(x, v) = c$ which lies in $R_2$ by $R_2(c)$.

THEOREM II. If the requirements of Theorem I are satisfied and if for every $c$ the minimum of

$$F(x, v) = \frac{1}{v^2} + \frac{1}{v f(x, v)} \frac{\partial f(x, v)}{\partial v}$$

on $R_2(c)$ is positive and exceeds the maximum of $F(x, v)$ on $R_1(c)$, then the equations (2.1) possess a unique periodic solution.⁸


As we have seen, the proof of this theorem will follow at once if we prove that under the conditions of this theorem every closed integral curve is stable. We shall therefore determine the condition for stability by the now classic method of Poincaré. We need only concern ourselves with the formal aspects of the condition here since the rigorous analysis is well known. Let us consider an integral curve. We denote the point at which this curve cuts the upper $v$-axis by $v_0$. We denote this curve by $v(x, v_0)$. Then let us next consider the integral curve which starts at the point $(0, v_0 + \delta v_0)$, where $\delta v_0$ is small. Then

$$v(x, v_0 + \delta v_0) = v(x, v_0) + \delta v_0 \frac{\partial v(x, v_0 + h\,\delta v_0)}{\partial v_0},$$

where $0 < h < 1$. Or, if we denote $v(x, v_0 + \delta v_0) - v(x, v_0)$ by $\Delta v$, then

$$\Delta v = \delta v_0 \frac{\partial v(x, v_0 + h\,\delta v_0)}{\partial v_0}. \tag{3.0}$$

If $v(x, v_0)$ is a closed integral curve, then for stability, it is necessary and sufficient that the integral curve starting at $(0, v_0 + \delta v_0)$ first again intersect the positive $v$-axis for increasing $t$ at a point between $v_0$ and $v_0 + \delta v_0$. In other


8 Actually the requirements of this theorem need only hold for $c < \Lambda$ where $\Lambda$ is a constant such that it is known that there are no closed integral curves in the region $\lambda(x, v) \ge \Lambda$. From the proof of Theorem I it is clear that $\Lambda$ can be taken as $\lambda_0$.







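words, that here $|\Delta v| < |\delta v_0|$ for sufficiently small values of $\delta v_0$. Since $\partial v/\partial v_0$ is continuous, by (3.0) this is equivalent to $|\partial v(0, v_0)/\partial v_0| < 1$ when the positive $v$-axis is intersected for the first time as $t$ increases after starting from $(0, v_0)$ along the closed integral curve. From

$$\frac{dv}{dx} = -\frac{g(x)}{v} - f(x, v)$$

it follows for $v(x, v_0)$ that

$$\frac{d}{dx} \frac{\partial v}{\partial v_0} = -\frac{\partial f}{\partial v} \frac{\partial v}{\partial v_0} + \frac{g(x)}{v^2} \frac{\partial v}{\partial v_0}.$$

Thus

$$\frac{\dfrac{d}{dx}\left( \dfrac{\partial v}{\partial v_0} \right)}{\dfrac{\partial v}{\partial v_0}} = -\frac{\partial f}{\partial v} + \frac{g(x)}{v^2} = -\frac{\partial f}{\partial v} - \frac{1}{v}\frac{dv}{dx} - \frac{1}{v} f(x, v).$$

Or, integrating between two points $A$ and $B$,

$$\log \left| \frac{\partial v}{\partial v_0} \right| \Bigg|_A^B = -\int_A^B \left[ \frac{\partial f}{\partial v} + \frac{1}{v} f(x, v) \right] dx - \log |v|\, \Bigg|_A^B.$$

In particular, integrating once around a closed integral curve we take $A$ and $B$ as $(0, v_0)$. At $A$, the starting point on the positive $v$-axis, $\partial v/\partial v_0 = 1$. Moreover, $v_A = v_B = v_0$. Thus

$$\log \left| \frac{\partial v}{\partial v_0} \right|_B = -\oint \left[ \frac{\partial f}{\partial v} + \frac{1}{v} f(x, v) \right] dx,$$

where this last integral is extended around the closed integral curve. The condition that $|\partial v/\partial v_0|_B < 1$ is clearly equivalent to

$$\oint \left[ \frac{\partial f}{\partial v} + \frac{1}{v} f(x, v) \right] dx > 0.$$

Since $dx = v\,dt$, this last integral can be written as

$$\oint \left[ v \frac{\partial f}{\partial v} + f(x, v) \right] dt > 0, \tag{3.1}$$

where the integral is taken around the closed integral curve. This, then, is a condition for stability.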
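
The stability integral (3.1) can be evaluated numerically on a concrete closed integral curve. The sketch below is an illustration under stated assumptions: it uses the Van der Pol case $f(x, v) = \mu(x^2 - 1)$, $g(x) = x$ with $\mu = 1$, for which $\partial f/\partial v = 0$ and (3.1) reduces to $\oint f(x)\,dt > 0$. The Runge-Kutta integrator, the step size, the transient length and the use of upward crossings of $x = 0$ to mark one period are choices made for the example, not part of the paper.

```python
def f(x, mu):
    """Damping coefficient of the Van der Pol equation (independent of v)."""
    return mu * (x * x - 1.0)

def rhs(x, v, mu):
    """Right side of (2.1) with g(x) = x."""
    return v, -f(x, mu) * v - x

def rk4_step(x, v, dt, mu):
    """One classical Runge-Kutta step of size dt."""
    k1x, k1v = rhs(x, v, mu)
    k2x, k2v = rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, mu)
    k3x, k3v = rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, mu)
    k4x, k4v = rhs(x + dt * k3x, v + dt * k3v, mu)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0)

def stability_integral(mu=1.0, dt=0.001, t_transient=60.0):
    """Return (integral of f dt over one period, period) on the limit cycle."""
    x, v = 0.5, 0.0
    for _ in range(int(round(t_transient / dt))):   # let the transient die out
        x, v = rk4_step(x, v, dt, mu)
    while True:                                     # find an upward crossing of x = 0
        xn, vn = rk4_step(x, v, dt, mu)
        crossed = x < 0.0 <= xn
        x, v = xn, vn
        if crossed:
            break
    integral, steps = 0.0, 0
    while True:                                     # integrate until the next crossing
        xn, vn = rk4_step(x, v, dt, mu)
        integral += 0.5 * (f(x, mu) + f(xn, mu)) * dt   # trapezoidal rule
        steps += 1
        crossed = x < 0.0 <= xn
        x, v = xn, vn
        if crossed:
            break
    return integral, steps * dt

if __name__ == "__main__":
    value, period = stability_integral()
    print("integral of f dt over one period:", value)
    print("period:", period)
```

The computed integral is positive, so the closed curve found numerically satisfies (3.1), as one expects of a curve that nearby solutions approach.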
We next concern ourselves with an integral which in certain applications is equivalent to an energy consideration. From (2.1) we have

$$v\,\varphi[\lambda(x, v)] \left( \frac{dv}{dt} + v f(x, v) + g(x) \right) = 0,$$










where $\varphi(\lambda)$ is any integrable function of $\lambda$. This equation can be written as

$$\varphi[\lambda(x, v)] \frac{d\lambda(x, v)}{dt} + v^2 \varphi[\lambda(x, v)] f(x, v) = 0.$$

Integrating with respect to $t$ around a closed integral curve, the first term drops out. Thus we have

$$\oint v^2 \varphi[\lambda(x, v)] f(x, v)\,dt = 0 \tag{3.2}$$

on a closed integral curve.

Fig. 4

Now let $\varphi(\lambda) = 1/\delta$ when $u < \lambda < u + \delta$ and $\varphi(\lambda) = 0$ otherwise. If $abcdefgh$ in Figure 4 represents a closed integral curve then

$$\oint v^2 \varphi[\lambda(x, v)] f(x, v)\,dt = \frac{1}{\delta} \left[ \int_a^b + \int_c^d + \int_e^f + \int_g^h \right] v^2 f(x, v)\,dt.$$

Let

$$G_1(u) = \int v^2 \varphi[\lambda(x, v)] f(x, v)\,dt,$$

$$G_2(u) = \int v^2 \varphi[\lambda(x, v)] f(x, v)\,dt,$$

where the first integral is extended along that part of the closed integral curve in $R_1$ and the second, in $R_2$. From (3.2),

$$G_1(u) + G_2(u) = 0. \tag{3.3}$$










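Now let

$$\lim_{\delta \to 0} G_1(u) = g_1(u), \qquad \lim_{\delta \to 0} G_2(u) = g_2(u).$$

Then from (3.3),

$$g_1(u) + g_2(u) = 0. \tag{3.4}$$

Clearly

$$g_1(u) = \sum v^2 f(x, v) \left| \frac{dt}{d\lambda} \right|, \tag{3.5}$$

where the sum extends over the points where the closed integral curve is cut by $\lambda(x, v) = u$ in $R_1$. A similar result is true for $g_2(u)$. $dt/d\lambda$ is the reciprocal of $d\lambda/dt$ which is the velocity with which the curves $\lambda(x, v) = c$ are cut in phase space by the integral curve.

Next let

$$H_1(u) = \int \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] \varphi[\lambda(x, v)]\, v^2 f(x, v)\,dt$$

and

$$H_2(u) = \int \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] \varphi[\lambda(x, v)]\, v^2 f(x, v)\,dt,$$

where the first integral is extended along that part of the closed integral curve in $R_1$ and the second, in $R_2$. Let

$$\lim_{\delta \to 0} H_1(u) = h_1(u), \qquad \lim_{\delta \to 0} H_2(u) = h_2(u).$$

Then clearly

$$\oint \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] v^2 f(x, v)\,dt = \int_0^{\infty} [h_1(u) + h_2(u)]\,du. \tag{3.6}$$

Much as in (3.5)

$$h_1(u) = \sum \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] v^2 f(x, v) \left| \frac{dt}{d\lambda} \right|, \tag{3.7}$$

where the sum extends over the points where the closed integral curve is cut by $\lambda(x, v) = u$ in $R_1$ and a similar result holds for $h_2(u)$. Since $g_2(u) \ge 0$ and $g_1(u) \le 0$ it follows from (3.5) and (3.7) that

$$h_1(u) \ge \max_{R_1(u)} \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] g_1(u),$$

and similarly

$$h_2(u) \ge \min_{R_2(u)} \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] g_2(u).$$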








Thus, by the statement of Theorem II,

$$h_1(u) + h_2(u) \ge \min_{R_2(u)} \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] (g_1(u) + g_2(u)).$$

Or, by (3.4),

$$h_1(u) + h_2(u) \ge 0. \tag{3.8}$$

Now unless $\dfrac{1}{vf} \dfrac{\partial f}{\partial v} + \dfrac{1}{v^2}$ is constant on the curves $\lambda(x, v) = c$, the inequality sign in (3.8) must hold for almost all $u$. Thus

$$\int_0^{\infty} [h_1(u) + h_2(u)]\,du > 0$$

or, by (3.6),

$$\oint \left[ \frac{1}{vf} \frac{\partial f}{\partial v} + \frac{1}{v^2} \right] v^2 f(x, v)\,dt > 0.$$

But this is (3.1) which assures stability on every closed curve. This proves Theorem II except for the case where $(1/vf)(\partial f/\partial v) + 1/v^2$ is constant on the curves $\lambda(x, v) = c$. This last case is excluded by the hypothesis of Theorem I as we shall now show, for it implies that $(1/vf)(\partial f/\partial v) + 1/v^2$ is a function of $\lambda(x, v)$. That is,

$$\frac{1}{v} \frac{\partial}{\partial v} \log (vf) = \psi(\tfrac{1}{2}v^2 + G(x)).$$

Or, solving,

$$f(x, v) = \frac{1}{v} \exp \int v\, \psi(\tfrac{1}{2}v^2 + G(x))\,dv,$$

where $x$ is held constant during the integration in $v$. Clearly, no matter what function $\psi$ is, this implies that $f(x, 0)$ is infinite for almost all $x$ and, therefore, that (2.3) in the hypothesis of Theorem I cannot be fulfilled.

Theorem II has several useful corollaries which we shall now state and prove. 


THEOREM III. In the equations

$$\frac{dx}{dt} = v, \qquad \frac{dv}{dt} + f(x)v + g(x) = 0,$$

$f(x)$ and $g(x)$ are differentiable functions. There exist an $x_1 > 0$ and an $x_2 > 0$ such that $f(x) < 0$ $(-x_1 < x < x_2)$ and $f(x) \ge 0$ otherwise. Also $xg(x) > 0$ $(|x| > 0)$. Further, let


$$\int_0^{\pm\infty} g(x)\,dx = \int_0^{\infty} f(x)\,dx = \infty. \tag{3.9}$$










Also let

$$\int_0^x g(x)\,dx = G(x)$$

and suppose

$$G(-x_1) = G(x_2).$$

Then it follows that the equation has a unique periodic solution except for translations in $t$.

Note that $f(x)$ need not be even nor $g(x)$ be odd, but that if this is the case then the requirement $G(-x_1) = G(x_2)$ is automatically satisfied.

Here the conditions of Theorem II become simply that

$$\min_{R_2(c)} \frac{1}{v^2} \ge \max_{R_1(c)} \frac{1}{v^2}. \tag{3.10}$$

But $R_1$ is the strip $-x_1 \le x \le x_2$ and $R_2$ is $x \ge x_2$ and $x \le -x_1$. Recall the fact that the curves $\lambda(x, v) = c$ have negative slope in the first and third quadrants of the $(x, v)$-plane and positive slope in the second and fourth. Also, since $G(-x_1) = G(x_2)$, it follows that $\tfrac{1}{2}v^2 + G(x) = c$ intersects the lines $x = -x_1$ and $x = x_2$ for the same positive and negative values of $v$, that is, let us say, at $\pm v_c$. But from this it follows at once that

$$\min_{R_2(c)} \frac{1}{v^2} = \frac{1}{v_c^2}, \qquad \max_{R_1(c)} \frac{1}{v^2} = \frac{1}{v_c^2}.$$

Thus (3.10) is satisfied and hence Theorem III is a consequence of Theorem II. The other conditions in Theorem III not used so far simply assure that the requirements of Theorem I are satisfied.
A more general corollary of Theorem II which includes Theorem III as a 


special case is 
THEOREM IV. Let the requirements of Theorem I be satisfied. Further, let
there be an $x_1 > 0$ and an $x_2 > 0$ such that

$$f(x, v) < 0 \qquad (-x_1 < x < x_2)$$

and $f(x, v) \ge 0$ otherwise. Let $G(-x_1) = G(x_2)$ and suppose

$$v\,\frac{\partial f}{\partial v} \ge 0.$$


Then (2.1) possesses a unique periodic solution except for translations in t. 


The proof of this theorem goes in much the same way as the preceding one 
in so far as 


(3.11)  $\min_{R_2(c)} \frac{1}{v^2} \ge \max_{R_1(c)} \frac{1}{v^2}$
















GENERAL EQUATION FOR RELAXATION OSCILLATIONS 399 


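goes. Further, by $v(\partial f/\partial v) \ge 0$,

$$\frac{1}{f(x, v)\,v}\frac{\partial f}{\partial v} \ge 0$$

in $R_2$ since here $f \ge 0$, and

$$\frac{1}{f(x, v)\,v}\frac{\partial f}{\partial v} \le 0$$

in $R_1$, where $f < 0$. From this and (3.11) it follows at once that

$$\min_{R_2(c)} \left[\frac{1}{vf}\frac{\partial f}{\partial v} + \frac{1}{v^2}\right] \ge \max_{R_1(c)} \left[\frac{1}{vf}\frac{\partial f}{\partial v} + \frac{1}{v^2}\right]$$

and thus Theorem IV is a consequence of Theorem II.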


4. Liénard’s method. Here we shall consider the equation

(4.0)  $\ddot{x} + f(x)\dot{x} + g(x) = 0,$

where f(x) is an even function such that for the odd function

$$F(x) = \int_0^x f(x)\,dx$$

there exists an $x_0$ with $F(x) < 0$ for $0 < x < x_0$, and $F(x) > 0$ and monotonically
increasing for $x > x_0$. Moreover, g(x) is an odd differentiable function
such that $g(x) > 0$ for $x > 0$. We further assume that

(4.1)  $\int_0^\infty f(x)\,dx = \int_0^\infty g(x)\,dx = \infty$


although milder conditions of the type given in Theorem I would suffice. Under 
these conditions we shall show that (4.0) possesses a unique periodic solution. 

For the case g(x) = x this result has been demonstrated by Liénard. Here 
we shall modify the proof of Liénard so that it applies to (4.0). The result 
here is more inclusive than Theorem III of §3 in so far as the requirements on 
F(x) go, but more restrictive in requiring that f(x) be even and g(x) odd. 

As before, we set $\dot{x} = v$. Then (4.0) becomes

(4.2)  $\frac{dv}{dx} + f(x) + \frac{g(x)}{v} = 0.$

We now introduce $y = v + F(x)$. Then (4.2) can be written as

(4.3)  $\frac{dy}{dx} + \frac{g(x)}{y - F(x)} = 0.$


Clearly, a unique periodic solution of (4.0) is equivalent to a unique closed 
integral curve for (4.2) which in turn is equivalent to a unique closed integral 
curve for (4.3). Moreover, since (4.3) remains unchanged if $(x, y)$ is replaced
by $(-x, -y)$ it follows that if a closed integral curve passes through $(0, y_0)$
it must also pass through $(0, -y_0)$. For if this were not the case the reflection
in the origin, that is, the replacing of $(x, y)$ by $(-x, -y)$, would give rise to another
closed integral curve which intersects the first one. But two integral curves
cannot intersect, except at the origin, and thus a closed integral curve starting
at $(0, y_0)$ passes through $(0, -y_0)$. In fact, what we have really shown is that
a closed integral curve is symmetric with respect to the origin. Conversely,
any integral curve starting at $(0, y_0)$ and passing through $(0, -y_0)$ must be
closed since on leaving $(0, -y_0)$ it must follow the reflection in the origin of
the path from $(0, y_0)$ to $(0, -y_0)$. Thus to find a closed integral curve of (4.3)
is equivalent to finding an integral curve with positive and negative y intercepts
equal.

In showing that there is only one integral curve with positive y intercept equal
to its negative y intercept, we shall study the change in intercepts of the integral
curves by studying how these integral curves cut across the curves $\lambda(x, y) = c$,
where

$$\lambda(x, y) = \tfrac12 y^2 + G(x).$$

As in the previous articles

$$G(x) = \int_0^x g(x)\,dx.$$

Clearly $\lambda(x, y)$ is symmetric with respect to changes in sign in both x and y.
Thus, if $\lambda(0, y)$ is the same for an integral curve for both positive and negative
y, then the integral curve is closed.

Looking at (4.3), it follows at once that, for $x > 0$, integral curves have
negative slope where $y > F(x)$ and positive slope where $y < F(x)$. The slope
is infinite where $y = F(x)$. Thus in Figure 5 $ACB$, $A'C'B'$, and $A''C''B''$
are all integral curves. The equation (4.3) can be written as

$$y\,dy + g(x)\,dx = F(x)\,dy$$

or as

(4.4)  $d\lambda(x, y) = F(x)\,dy.$
We now consider a section of an integral curve ACB such that C, the intersection
of the curve with $y = F(x)$, falls in the strip $0 < x < x_0$. In this strip,
$F(x) < 0$. Moreover, from A to B, $dy < 0$. Thus $F(x)\,dy > 0$ and from (4.4)

$$\int_A^B d\lambda(x, y) > 0$$

or, in other words, $\lambda_B - \lambda_A > 0$. Thus, $OB > OA$.
Next we consider integral curves which intersect $y = F(x)$ to the right of
$x = x_0$. $A'B'C'$ and $A''B''C''$ are two such curves. From (4.3) and (4.4)

(4.5)  $d\lambda(x, y) = \frac{-F(x)g(x)}{y - F(x)}\,dx.$







Since $-F(x) > 0$ for $0 < x < x_0$ and since $y - F(x)$ is greater along $A''G$ than
along $A'E$, it follows from this equation that

$$\int_{A''}^{G} d\lambda(x, y) < \int_{A'}^{E} d\lambda(x, y),$$

each integral being taken along the proper integral curve. In other words,

(4.6)  $\lambda_G - \lambda_{A''} < \lambda_E - \lambda_{A'}.$











Fig. 5

From $d\lambda(x, y) = F(x)\,dy$, it follows since $F(x) > 0$ along GH that $d\lambda(x, y) < 0$.
Thus

(4.7)  $\lambda_H - \lambda_G < 0.$

Since for the same y, F(x) along HI exceeds F(x) along EF, it follows from
$d\lambda(x, y) = F(x)\,dy$ that

$$\int_H^I d\lambda(x, y) < \int_E^F d\lambda(x, y)$$










or that

(4.8)  $\lambda_I - \lambda_H < \lambda_F - \lambda_E.$

Just as along GH, so also along IJ it follows that

(4.9)  $\lambda_J - \lambda_I < 0.$

Similarly just as (4.6) is obtained we get

(4.10)  $\lambda_{B''} - \lambda_J < \lambda_{B'} - \lambda_F.$

Adding (4.6) through (4.10), we get

$$\lambda_{B''} - \lambda_{A''} < \lambda_{B'} - \lambda_{A'}.$$

In other words,

$$OB'' - OA'' \le OB' - OA'.$$

In other words, $OB - OA > 0$ so long as C lies in $0 < x \le x_0$. When C moves
outward along $y = F(x)$ and cuts $y = F(x)$ for $x > x_0$, then $OB - OA$ is a
monotonically decreasing function. Clearly, then, OB can equal OA at most
once, meaning there is at most one closed integral curve. From our general
theory, Theorem I, we know there is at least one closed integral curve and thus
we are through. However, since the proof so far has been quite elementary,
we shall give an independent proof that $OB - OA$ becomes negative as A
moves up the y-axis. To see this we have only to observe that the increase
in $\lambda$ in going from $A''$ to G and from J to $B''$ is monotonically decreasing as
$A''$ moves up the y-axis and is therefore bounded. From G to J, $\lambda$ decreases.
Consider the intercepts of the integral curves on the line $x = 2x_0$ as $A''$ moves
up the y-axis. Since by (4.3)


$$dy = -\frac{g(x)}{y - F(x)}\,dx,$$

it follows that starting at $A''$, in the interval $0 < x < 2x_0$,

$$|dy| \le \frac{g(x)\,dx}{|y - F(2x_0)|}.$$

Or if $y_0$ denotes $y(2x_0)$, we have on integrating

$$\frac{(OA'')^2}{2} - \frac{y_0^2}{2} - (OA'' - y_0)F(2x_0) \le G(2x_0).$$

Or

$$(OA'' - y_0)\left[\frac{OA'' + y_0}{2} - F(2x_0)\right] \le G(2x_0).$$

Hence

$$OA'' - y_0 \le \frac{2G(2x_0)}{OA'' - 2F(2x_0)}.$$



















Thus, as $OA'' \to \infty$, $y_0 \to OA''$. Since no two integral curves intersect, it
follows that the distance between the intercepts on $x = 2x_0$ of an integral
curve for which $OA'' \to \infty$ is also tending toward infinity. But $d\lambda = F(x)\,dy$.
Since, for $x > 2x_0$, $F(x) > F(2x_0)$, $d\lambda < F(2x_0)\,dy$. Integrating between
the intercepts on $x = 2x_0$,

$$\int d\lambda < -F(2x_0)D,$$

where D is the distance between the intercepts. Since $D \to \infty$, it follows
that the decrease in $\lambda$ in going from G to J can be made arbitrarily large. On
the other hand, the increase from $A''$ to G and from J to $B''$ is, as we have
seen, bounded. Thus, for large OA, $\lambda_B - \lambda_A < 0$ or $OB - OA < 0$. This
completes the proof.
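The intercept argument of this section can be illustrated numerically. The following is only a sketch under stated assumptions: it takes Liénard's case g(x) = x with the sample choice f(x) = mu (x^2 - 1), so that F(x) = mu (x^3/3 - x) and x_0 = sqrt(3), and it follows integral curves of (4.3) in the parametric form dx/dt = y - F(x), dy/dt = -g(x) with a fixed-step Runge-Kutta scheme. The function names, the value of mu, and the step size are illustrative choices and are not part of the paper.

```python
def lienard_return(y_a, mu=1.0, dt=1e-3, t_max=200.0):
    """Follow dx/dt = y - F(x), dy/dt = -g(x) from A = (0, y_a), y_a > 0,
    until the curve first returns to x = 0; return OB = -y at that point."""
    F = lambda x: mu * (x ** 3 / 3.0 - x)  # assumed F, from f(x) = mu (x^2 - 1)
    g = lambda x: x                        # Lienard's original case g(x) = x

    def rhs(x, y):
        return y - F(x), -g(x)

    x, y, t = 0.0, y_a, 0.0
    while t < t_max:
        # one classical fourth-order Runge-Kutta step
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        xn = x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        yn = y + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        if x > 0.0 and xn <= 0.0:
            # linear interpolation to the crossing of the y-axis
            s = x / (x - xn)
            return -(y + s * (yn - y))
        x, y, t = xn, yn, t + dt
    raise RuntimeError("no return to the y-axis within t_max")


def lambda_gap(y_a, mu=1.0):
    """lambda_B - lambda_A = (OB^2 - OA^2) / 2, since G(0) = 0."""
    ob = lienard_return(y_a, mu)
    return 0.5 * (ob * ob - y_a * y_a)
```

For a small intercept such as OA = 0.5 the gap should come out positive (the curve turns inside the strip 0 < x < x_0), while for larger intercepts it should be negative and decreasing, in line with the sum of (4.6) through (4.10).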


MASSACHUSETTS INSTITUTE OF TECHNOLOGY. 





THE DIVERGENCE OF NON-HARMONIC GAP SERIES 
By PHILIP HARTMAN

It was recently shown¹ that if $\lambda_0, \lambda_1, \cdots$ is a sequence of positive real numbers
satisfying the gap condition

(1)  $\frac{\lambda_k}{\lambda_{k-1}} > q > 1 \qquad (k = 1, 2, \cdots),$

then the convergence of the series

(2)  $\sum_{k=0}^{\infty} |a_k|^2$

implies the convergence of

(3)  $\sum_{k=0}^{\infty} a_k e^{i\lambda_k t}$

for almost all t, $-\infty < t < +\infty$, while if (1) is modified so that “$q > 1$” is
replaced by “$q > \tfrac12(5^{1/2} + 1)$”, then the divergence of (2) implies the divergence
of (3) for almost all t, $-\infty < t < +\infty$. The object of this note is to show that
the condition (1), without any modification, and the divergence of (2) implies the
divergence of the series (3) for almost all t, $-\infty < t < +\infty$.

In order to prove this statement, let $\sigma(E)$ denote the completely additive measure
on the t-axis which has the non-negative density²

(4)  $\frac{1 - \cos t}{\pi t^2},$

so that, if E is a measurable set,

(5)  $\sigma(E) = \int_E \frac{1 - \cos t}{\pi t^2}\,dt.$

Obviously, for any measurable set E,

(6)  $0 \le \sigma(E) \le 1.$

The Fourier-Stieltjes transform of this $\sigma$-measure,

(7)  $\int_{-\infty}^{+\infty} e^{i\lambda t}\,d\sigma(t) = \max(1 - |\lambda|, 0),$

vanishes for all $|\lambda| \ge 1$.
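The identity (7) can be checked numerically. The sketch below is an illustration only: it truncates the integral to |t| <= T and uses a midpoint rule, so agreement with max(1 - |lambda|, 0) is expected only up to an error of order 1/T. The function names and the values of T and h are assumptions made for this example.

```python
import math


def sigma_transform(lam, T=1000.0, h=0.02):
    """Midpoint-rule approximation of the integral in (7), truncated to |t| <= T.
    The density (4) is even, so only the cosine part on (0, T) contributes."""
    n = int(T / h)
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoints avoid the removable singularity at t = 0
        total += (1.0 - math.cos(t)) / (math.pi * t * t) * math.cos(lam * t)
    return 2.0 * h * total


def triangle(lam):
    """Right side of (7)."""
    return max(1.0 - abs(lam), 0.0)
```

Taking lam = 0 also recovers the total mass 1 of the measure, which is the upper bound in (6).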


Received January 5, 1942. 

1M. Kac, Convergence and divergence of non-harmonic gap series, Duke Mathematical 
Journal, vol. 8(1941), pp. 541-545. 

2 The introduction of this measure function avoids the awkward construction, used in 
loc. cit., see footnote 1, of a measure whose Fourier-Stieltjes transform vanishes on a par- 
ticular sequence of points. It also makes it possible to use, with only the slightest of 
modifications, the method applied by A. Zygmund, Trigonometrical Series, Warsaw, 1935, 
pp. 120-122, in the case that the frequencies $\lambda_k$ are integers.




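Let E be any measurable set and let m, n (n > m) be a pair of positive integers.
Then

(8)  $\int_E \Bigl|\sum_{k=m}^{n} a_k e^{i\lambda_k t}\Bigr|^2 d\sigma(t) = \sigma(E)\sum_{k=m}^{n} |a_k|^2 + \sum_{k=m}^{n}\sum_{\substack{j=m \\ j \ne k}}^{n} a_k\bar{a}_j \int_E e^{i(\lambda_k - \lambda_j)t}\,d\sigma(t).$

By Schwarz’s inequality, the absolute value of the last term is majorized by

(9)  $\Bigl(\sum_{k=m}^{n}\sum_{j=m}^{n} |a_k|^2 |a_j|^2\Bigr)^{1/2} \Bigl(\sum_{k=m}^{n}\sum_{\substack{j=m \\ j \ne k}}^{n} \Bigl|\int_E e^{i(\lambda_k - \lambda_j)t}\,d\sigma(t)\Bigr|^2\Bigr)^{1/2}.$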


To appraise the last expression, it is necessary to consider the structure of the
set of numbers $\lambda_k - \lambda_j$ ($k > j$; $j, k = 0, 1, \cdots$). First, there exists a positive
number $\delta$ such that $\lambda_k - \lambda_j > \delta$ for all $k > j$. In fact, by (1),
$\lambda_k - \lambda_j \ge \lambda_k - \lambda_{k-1} > \lambda_k(1 - q^{-1}) \ge \lambda_1(1 - q^{-1})$, so that $\delta$ may be taken to be
$\lambda_1(1 - q^{-1})$. Secondly, the number of numbers $\lambda_k - \lambda_j$, $k > j$, in any interval
$c \le x \le c + 1$, $c \ge \delta$, is bounded. For if $c \le \lambda_k - \lambda_j \le c + 1$, then $c \le \lambda_k$
and $c + 1 \ge \lambda_k(1 - q^{-1})$. In virtue of (1) the number of integers k such that
$q(c + 1)/(q - 1) \ge \lambda_k \ge c$ is at most $\{\log ((c + 1)/c) + \log (q/(q - 1))\}/\log q$,
which is bounded for $\delta \le c < +\infty$. From these two properties of the numbers
$\lambda_k - \lambda_j$, $k > j$, it follows that the set of numbers $\lambda_k - \lambda_j$ ($k \ne j$; $j, k = 0, 1, \cdots$)
can be divided into a finite number of sequences, say N sequences, such that the
absolute value of the difference of any two numbers in the same sequence exceeds
1. The identity (7) implies that each of the corresponding sequences of functions
$e^{i(\lambda_k - \lambda_j)t}$ forms an orthogonal sequence on $-\infty < t < +\infty$ with respect
to the $\sigma$-measure. Hence, by the Bessel inequality, the series

$$\sum_{k=0}^{\infty}\sum_{\substack{j=0 \\ j \ne k}}^{\infty} \Bigl|\int_E e^{i(\lambda_k - \lambda_j)t}\,d\sigma(t)\Bigr|^2$$

converges (and has a sum which does not exceed $N\sigma(E)$).
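The two properties of the differences used above can be seen on a concrete sequence. The sketch below takes the sample frequencies 2^k (k = 0, ..., 29), an assumption made only for illustration; they satisfy (1) with, say, q = 1.5. It lists the positive differences and counts how many can fall in a closed window of length one. The helper names are hypothetical.

```python
def differences(lams):
    """All positive differences lams[k] - lams[j] with k > j, sorted."""
    return sorted(lams[k] - lams[j] for k in range(len(lams)) for j in range(k))


def max_in_unit_window(diffs):
    """Largest number of differences lying in a closed window [c, c + 1].
    It suffices to try windows whose left end c is itself a difference."""
    best = 0
    for i, c in enumerate(diffs):
        j = i
        while j < len(diffs) and diffs[j] <= c + 1:
            j += 1
        best = max(best, j - i)
    return best


lams = [2 ** k for k in range(30)]  # sample sequence, ratio 2 > q = 1.5
diffs = differences(lams)
smallest = min(diffs)               # plays the role of a lower bound delta
crowding = max_in_unit_window(diffs)
```

For this sequence the differences are distinct integers, so no unit window holds more than two of them, and the set of differences splits into a small fixed number of sequences whose members are more than 1 apart.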
Suppose that the integer m is chosen so large that

$$\sum_{k=m}^{\infty}\sum_{\substack{j=m \\ j \ne k}}^{\infty} \Bigl|\int_E e^{i(\lambda_k - \lambda_j)t}\,d\sigma(t)\Bigr|^2 < \bigl(\tfrac12\sigma(E)\bigr)^2.$$

Then, from (8) and (9),

$$\int_E \Bigl|\sum_{k=m}^{n} a_k e^{i\lambda_k t}\Bigr|^2 d\sigma(t) \ge \tfrac12\sigma(E)\sum_{k=m}^{n} |a_k|^2.$$

This inequality and the finiteness (6) of the $\sigma$-measure imply that if the series
(2) diverges and if E is a measurable set such that (3) converges for all points t
on E, then $\sigma(E) = 0$. But since the density (4) of $\sigma$ vanishes for only an enumerable
set of t-values, it follows from $\sigma(E) = 0$ that the Lebesgue measure of E is 0.
This completes the proof of the italicized statement.


FORT BRAGG, N. C.





INFLUENCE OF THE SIGNS OF THE DERIVATIVES OF A FUNCTION 
ON ITS ANALYTIC CHARACTER 


By R. P. BOAS, JR. AND G. PÓLYA


1. Introduction. In what follows, f(x) denotes a real-valued function defined
and of class $C^\infty$ in $[-1, 1]$, i.e., possessing derivatives of all orders in the closed
interval $-1 \le x \le 1$.¹

Serge Bernstein investigated the analytic nature of functions whose deriva- 
tives are each of constant sign in [—1, 1]. That such a function is necessarily 
analytic in (—1, 1) is contained as a very special case in one of his earlier theo- 
rems. But he observed also that the signs of the derivatives have a certain 
influence.³ If no derivative of f(x) vanishes in $(-1, 1)$, $|f^{(n)}(x)|$ is either
steadily increasing or steadily decreasing; we have the first or the second case
according as $f^{(n)}(x)f^{(n+1)}(x) > 0$ or $f^{(n)}(x)f^{(n+1)}(x) < 0$ in $(-1, 1)$ (consider the
derivative of $[f^{(n)}(x)]^2$). We say that $f^{(m)}(x)$ and $f^{(n)}(x)$ (where $m < n$) belong
to the same “block” if $|f^{(m)}(x)|, |f^{(m+1)}(x)|, \cdots, |f^{(n-1)}(x)|, |f^{(n)}(x)|$ all vary
in the same sense, i.e., all increase or all decrease. Thus, $f^{(n)}(x)$ and $f^{(n+1)}(x)$
belong to different blocks if and only if

$$f^{(n)}(x)f^{(n+2)}(x) < 0.$$


Let $\lambda_1, \lambda_2, \lambda_3, \cdots$ denote the lengths of the successive blocks into which the
sequence $f(x), f'(x), f''(x), \cdots$ is decomposed; we assume here that no block
has infinite length, and that, therefore, there is an infinity of blocks. Bernstein
found the remarkable result that the lengths of the blocks influence the analytic
nature of the function. Roughly stated, the analytic nature of f(x) is simpler
if the blocks are shorter. E.g., if the sequence $\lambda_1, \lambda_2, \cdots$ is bounded, f(x) is
(or, more precisely, coincides in $[-1, 1]$ with) an entire function of exponential
type, i.e., an entire function whose growth does not exceed order one and
finite type.


Received January 5, 1942. 

1 We write $[a, b]$ for the closed interval $a \le x \le b$, and $(a, b)$ for the open
interval $a < x < b$. The conventions about f(x) do not apply to section 3.

2 S. Bernstein, (a) Sur la définition et les propriétés des fonctions analytiques d’une variable
réelle, Mathematische Annalen, vol. 75(1914), pp. 449-468, (b) Leçons sur les propriétés
extrémales et la meilleure approximation des fonctions analytiques d’une variable réelle, Paris,
1926; see especially pp. 196-197. Another proof has been given by R. P. Boas, Functions
with positive derivatives, this Journal, vol. 8(1941), pp. 163-172.

3 S. Bernstein, (a) On certain properties of regularly monotonic functions (in Russian),
Soobshcheniya Kharkovskogo Matematicheskogo Obshchestva (Communications de la
Société Mathématique de Kharkow), (4), vol. 2(1928), pp. 1-11, (b) Sur les fonctions
régulièrement monotones, Comptes Rendus Hebdomadaires des Séances de l’Académie des
Sciences, Paris, vol. 186(1928), pp. 1266-1269, (c) Same title, Atti del Congresso Internazionale
dei Matematici, Bologna, 1928, vol. 2(1930), pp. 267-275.

Observe that since f(x) is analytic in $(-1, 1)$ and not a polynomial, if no derivative changes
sign in $(-1, 1)$, then no derivative can vanish there.



D. V. Widder found recently that f(x) is necessarily an entire function of
exponential type if⁴

$$(-1)^n f^{(2n)}(x) \ge 0 \qquad (-1 \le x \le 1;\ n = 0, 1, 2, \cdots).$$

Widder’s condition would imply, if we knew that no derivative of f(x) vanishes
in $(-1, 1)$, that $f^{(2n)}(x)$ and $f^{(2n+1)}(x)$ do not belong to the same block, and that,
therefore, no block has a length greater than two. But, in fact, Widder’s
condition does not say anything about the non-vanishing of the derivatives of
odd order in $(-1, 1)$, and therefore Widder’s theorem is not contained in
Bernstein’s results.

The following theorem contains both Bernstein’s and Widder’s results.⁵

THEOREM 1. Let $\{n_k\}$ and $\{q_k\}$ be sequences of positive integers, $\{n_k\}$ strictly
increasing. Let f(x) be real-valued and of class $C^\infty$ in $[-1, 1]$. For $k = 1, 2, \cdots$,
let $f^{(n_k)}(x)$ and $f^{(n_k + 2q_k)}(x)$ not change sign in $[-1, 1]$, and let

$$f^{(n_k)}(x)f^{(n_k + 2q_k)}(x) \le 0.$$

(I) If $n_k - n_{k-1} = O(1)$ and $q_k = O(1)$, then f(x) coincides in $[-1, 1]$ with an
entire function of growth not exceeding order one and finite type.

(II) If $n_k - n_{k-1} = O(n_k^\delta)$, $q_k = O(n_k^\delta)$, and $q_1 + q_2 + \cdots + q_k = O(n_k)$,
where $\delta$ is fixed, $0 < \delta < 1$, then f(x) coincides in $[-1, 1]$ with an entire function
of finite order not exceeding $1/(1 - \delta)$.

(III) If $n_k - n_{k-1} = o(n_k)$, $q_k = o(n_k)$, and $q_1 + q_2 + \cdots + q_k = O(n_k)$,
then f(x) coincides in $[-1, 1]$ with an entire function.

In order to apply this theorem to Bernstein’s case, in which no derivative
vanishes in $(-1, 1)$, let $f^{(n_k)}(x)$ denote the last derivative belonging to the k-th
block, so that

$$f^{(n_k)}(x)f^{(n_k + 2)}(x) < 0,$$
$$n_1 = \lambda_1, \qquad \lambda_k = n_k - n_{k-1} \quad \text{for } k > 1.$$

Put $q_k = 1$; then the hypothesis of Theorem 1 is fulfilled, and the specialization
performed gives us exactly Bernstein’s results.⁶ On the other hand, assuming
that $2q_k = n_{k+1} - n_k$ and that $n_k$ is even, we obtain from Theorem 1 the following
direct generalization of Widder’s result.

THEOREM 2. Let $\{n_k\}$ be a strictly increasing sequence of positive even integers.
Let f(x) be real-valued and of class $C^\infty$ in $[-1, 1]$, and let

$$(-1)^k f^{(n_k)}(x) \ge 0 \qquad (k = 1, 2, \cdots).$$


4 D. V. Widder, Functions whose even derivatives have a prescribed sign, Proceedings of the
National Academy of Sciences, vol. 26(1940), pp. 657-659.

5 The main results of the present paper were stated in a joint note by the authors
(Generalizations of completely convex functions, Proceedings of the National Academy of Sciences,
vol. 27(1941), pp. 323-325), where previous contributions of both authors to the problem are
quoted.

6 See footnote 3, (a), pp. 4-5. 







(I) If $n_k - n_{k-1} = O(1)$, f(x) coincides in $[-1, 1]$ with an entire function of
growth not exceeding order one and finite type.

(II) If $n_k - n_{k-1} = O(n_k^\delta)$, $0 < \delta < 1$, f(x) coincides in $[-1, 1]$ with an entire
function of finite order not exceeding $1/(1 - \delta)$.

(III) If $n_k - n_{k-1} = o(n_k)$, f(x) coincides in $[-1, 1]$ with an entire function.


We cannot prove Theorem 1 by Bernstein’s method because, under its hypothesis,
many derivatives of f(x) may change sign in $[-1, 1]$ (see (IV) in §6);
and we cannot prove it by Widder’s method, because the sequence $n_1, n_2, n_3, \cdots$
may be much more irregular than the extremely regular special sequence 2, 4,
6, $\cdots$. Our proof (see §5) is based on an inequality for derivatives, in which
the constant sign of a derivative and the evenness of the difference between its
order and that of another derivative are aptly combined (see Lemma 6 in §4).

We thought it appropriate to include proofs of the main facts on which 
Lemma 6 is based (see Lemmas 1 to 5 in §§2 and 3). These facts constitute an 
important part of the technique of dealing with the derivatives of real functions, 
and our proofs are somewhat simpler than previous proofs. 

The last section of the paper (§6) is devoted to additional remarks and to the 
construction of examples showing that the results stated in Theorem 1 are 


fairly sharp. 


2. Derivatives of polynomials. We start from the following well-known and
easily proved fact.⁷

LEMMA 1. Assume that P(x) is a polynomial of degree not exceeding n, M is the
maximum of $|P(x)|$ in $[-1, 1]$, z is a point of the complex plane, a is the larger
and b is the smaller semi-axis of the ellipse with foci at the points +1 and −1 of
the complex plane, passing through z. Then

$$|P(z)| \le M(a + b)^n.$$
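Lemma 1 can be illustrated with the Chebyshev polynomial $T_n$, for which M = 1 on [−1, 1]. The sketch below is an illustration and not the proof referred to in the footnote: it evaluates $T_n$ at points z = a cos θ + i b sin θ of the ellipse with semi-axes a and b = (a² − 1)^{1/2} and compares |T_n(z)| with (a + b)^n. The function names and sample values are assumptions made for this example.

```python
import math


def chebyshev(n, z):
    """T_n(z) by the three-term recurrence; valid for complex z."""
    t_prev, t_cur = 1.0 + 0j, z
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2 * z * t_cur - t_prev
    return t_cur


def lemma1_ratio(n, a, theta):
    """|T_n(z)| / (a + b)^n for z on the ellipse with foci +1, -1 and
    larger semi-axis a; Lemma 1 (with M = 1) says this is at most 1."""
    b = math.sqrt(a * a - 1.0)
    z = complex(a * math.cos(theta), b * math.sin(theta))
    return abs(chebyshev(n, z)) / (a + b) ** n
```

For large n the ratio approaches 1/2 on the major axis, which suggests that the factor (a + b)^n in Lemma 1 cannot be improved by more than a constant.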


We use Lemma 1 to prove the following⁸

LEMMA 2. Under the hypothesis of Lemma 1,


7 This theorem is due to S. Bernstein and plays an important rôle in his work. For
proof and further references see, e.g., G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus
der Analysis, Berlin, 1925, vol. 1, section III, problem 270, pp. 137, 320.

8 The result of Lemma 2 is less exact but the proof is much simpler than that of W. Markoff,
Über Polynome, die in einem gegebenen Intervalle möglichst wenig von Null abweichen
(in Russian), St. Petersburg, 1892; German abridgement in Mathematische Annalen, vol.
77(1916), pp. 213-258. The method of proof is that of P. Montel, Sur les polynomes
d’approximation, Bulletin de la Société Mathématique de France, vol. 46(1918), pp. 151-192,
157ff. See for the whole question the address of A. C. Schaeffer, Inequalities of A. Markoff
and S. Bernstein for polynomials and related functions, Bulletin of the American Mathematical
Society, vol. 47(1941), pp. 565-579.













(1)  $|P^{(k)}(0)| \le 3k^{1/2} n^k M,$

(1*)  $|P^{(k)}(x)| \le 3k^{1/2}\,\frac{e^k n^{2k}}{2^k k^k}\,M$

for $k = 1, 2, \cdots, n - 1$ and $-1 \le x \le 1$.


It is easy to see that the ellipse with foci 1 and −1 and semi-axes a and b
contains the circle with center 0 and radius b, and also any circle whose center
is a point x of $[-1, 1]$ and whose radius is $a - 1$. Hence, by Lemma 1 and
Cauchy’s estimate for the absolute value of the k-th derivative,

(2)  $|P^{(k)}(0)| \le k!\,\frac{(a + b)^n}{b^k}\,M,$

(2*)  $|P^{(k)}(x)| \le k!\,\frac{(a + b)^n}{(a - 1)^k}\,M.$

We have still to choose the ellipse; we have, of course, the condition that

$$a^2 - b^2 = 1.$$


Under this condition, we seek the minimum of the right side of (2). The usual 
procedure leads us to the values 


$$a = \frac{n}{(n^2 - k^2)^{1/2}}, \qquad b = \frac{k}{(n^2 - k^2)^{1/2}}, \qquad a + b = \left(\frac{n + k}{n - k}\right)^{1/2};$$

substituting these values into the right side of (2), we obtain

(3)  $|P^{(k)}(0)| \le \frac{k!}{k^k}\,\frac{(n + k)^{(n+k)/2}}{(n - k)^{(n-k)/2}}\,M.$

Minimizing the right side of (2*) yields

$$a = \frac{n^2 + k^2}{n^2 - k^2}, \qquad b = \frac{2nk}{n^2 - k^2}, \qquad a + b = \frac{n + k}{n - k},$$

and these values change (2*) into

(3*)  $|P^{(k)}(x)| \le \frac{1}{k!\,2^k}\left\{\frac{k!}{k^k}\,\frac{(n + k)^{(n+k)/2}}{(n - k)^{(n-k)/2}}\right\}^2 M.$

Comparison of (3) and (3*) with each other and with (1) and (1*) shows that,
from this point on, it is sufficient to consider (3). Now

(4)  $\frac{(n + k)^{(n+k)/2}}{(n - k)^{(n-k)/2}} = n^k e^{k\varphi(k/n)},$

where we use the abbreviation


$$[(1 + x)\log (1 + x) - (1 - x)\log (1 - x)]/(2x) = \varphi(x) \qquad (0 < x < 1);$$

$\varphi(x)$ is defined by continuity in $[0, 1]$. We observe that in $(0, 1)$










and therefore that, for $0 < x < 1$,

$$\varphi(x) < \varphi(0) = 1.$$

Combining this with (4), (3), (3*), and the well-known inequality

$$k! < (2\pi)^{1/2} k^{k+1/2} e^{-k + 1/(12k)} < 3k^{k+1/2}e^{-k},$$

we obtain both (1) and (1*).
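The choice of ellipse in the proof of Lemma 2 can be checked numerically. The sketch below is an illustration under stated assumptions: it works with logarithms of (a + b)^n / b^k subject to a² − b² = 1, compares the stationary value b = k/(n² − k²)^{1/2} with the closed form appearing in (3), and compares the right side of (3) with the bound 3k^{1/2}n^k M, which is the reading of (1) assumed here. All names are hypothetical and M is set to 1.

```python
import math


def log_h(n, k, b):
    """log of (a + b)^n / b^k with a fixed by the focal condition a^2 - b^2 = 1."""
    a = math.sqrt(1.0 + b * b)
    return n * math.log(a + b) - k * math.log(b)


def optimal_b(n, k):
    """Stationary point of log_h in b, obtained from n/a = k/b."""
    return k / math.sqrt(n * n - k * k)


def log_closed_form(n, k):
    """log of (n+k)^((n+k)/2) / ((n-k)^((n-k)/2) k^k), the factor in (3)."""
    return (0.5 * (n + k) * math.log(n + k)
            - 0.5 * (n - k) * math.log(n - k)
            - k * math.log(k))


def bound_3(n, k):
    """Right side of (3) with M = 1."""
    return math.factorial(k) * math.exp(log_closed_form(n, k))


def bound_1(n, k):
    """Assumed reading of the right side of (1) with M = 1."""
    return 3.0 * math.sqrt(k) * n ** k
```

Since log_h tends to infinity both as b tends to 0 and as b tends to infinity (for k < n), the single stationary point is the minimum, which is what the comparison with nearby values of b reflects.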
3. Derivatives of real functions. We use Lemma 2 to prove the following 


lemma.⁹

LEMMA 3. Let the function f(x) be defined and possess an n-th derivative in
$[-l, l]$,¹⁰ and let it satisfy in this interval the conditions

(5)  $|f(x)| \le M_0, \qquad |f^{(n)}(x)| \le M_n.$

Define

(6)  $M_n' = \max (M_n,\ n!\,M_0 l^{-n}),$
(6*)  $M_n^* = \max (M_n,\ n!\,M_0 (2l)^{-n}).$

Then

(7)  $|f^{(k)}(0)| \le 6k^{1/2}e^k M_0^{1-k/n} M_n'^{\,k/n},$


9 Lemma 3 is essentially equivalent to two theorems due to A. Gorny, Contribution à
l’étude des fonctions dérivables d’une variable réelle, Acta Mathematica, vol. 71(1939), pp.
317-358. Our proof differs in two points from Gorny’s proof. (I) Instead of the best
polynomial approximation to f(x), we use the approximation given by Taylor’s formula; this
method was indicated (before Gorny) by O. Ore, On functions with bounded derivatives,
Transactions of the American Mathematical Society, vol. 43(1938), pp. 321-326. (II)
Instead of the best estimate for the k-th derivative of a polynomial, we use the approximate
estimate of Lemma 2; the possibility of such a variant was hinted by Gorny (op. cit., p.
321, footnote).

Essentially equivalent theorems have been announced by H. Cartan. See H. Cartan
and S. Mandelbrojt, Solution du problème d’équivalence des classes de fonctions indéfiniment
dérivables, Acta Mathematica, vol. 72(1940), pp. 31-49. More precise results, applying only
to the interval $(-\infty, \infty)$, are given by A. Kolmogoroff, On inequalities between upper bounds
of consecutive derivatives of an arbitrary function defined on an infinite interval (in Russian;
English summary), Uchenye Zapiski Moskovskogo Gosudarstvennogo Universiteta,
Matematika, vol. 30(1939), pp. 3-16; the results of this paper are also stated in Une généralisation
de l’inégalité de M. J. Hadamard entre les bornes supérieures des dérivées successives d’une
fonction, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences, Paris,
vol. 207(1938), pp. 764-765.

(Added in proof). Cartan’s results have appeared in a publication which reached this 
country while this paper was in the press (Sur les classes de fonctions définies par des 
inégalités portant sur leurs dérivées successives, Actualités Scientifiques et Industrielles, no. 
867, Paris, 1940). Cartan’s proof, like ours, uses Taylor’s formula. 

10 In section 3 the function f(z) need not have derivatives of all orders. 










| _Irnenee 
eye) f(a)| = Fett Mi Me 


for $k = 1, 2, \cdots, n$ and $-l \le x \le l$.


We shall prove in detail (7), which we must use later, and we shall indicate
the points where the proof of (7*) differs from that of (7). We assume in the
proofs that $l = 1$ (the general case is reduced to this special case by consideration
of $f(lx)$). We divide the proofs into two parts.

(I) We consider the polynomial

(8)  $P(x) = f(0) + \frac{x f'(0)}{1!} + \cdots + \frac{x^{n-1} f^{(n-1)}(0)}{(n - 1)!} = f(x) - \frac{x^n f^{(n)}(\theta x)}{n!},$

where $0 < \theta < 1$ (we use Lagrange’s form of the remainder of Taylor’s series).
By (5) and (8),

$$|P(x)| \le M_0 + \frac{M_n}{n!}$$

in $[-1, 1]$; hence it follows by Lemma 2 that

(9)  $|f^{(k)}(0)| = |P^{(k)}(0)| \le 3k^{1/2} n^k \left[M_0 + \frac{M_n}{n!}\right]$

for $k = 1, 2, \cdots, n - 1$. From (9), (7) can be obtained very quickly (see
below).
(I*) In order to prove (7*), we consider the polynomial in x

(8*)  $P(x) = f(\xi) + \frac{(x - \xi) f'(\xi)}{1!} + \cdots + \frac{(x - \xi)^{n-1} f^{(n-1)}(\xi)}{(n - 1)!} = f(x) - \frac{(x - \xi)^n f^{(n)}[\xi + \theta(x - \xi)]}{n!},$

where $\xi$ is a fixed point in $[-1, 1]$, and $0 < \theta < 1$. If x is in $[-1, 1]$, it follows
from (5) and (8*) that

$$|P(x)| \le M_0 + \frac{2^n M_n}{n!},$$

and hence by Lemma 2 that

(9*)  $|f^{(k)}(\xi)| = |P^{(k)}(\xi)| \le 3k^{1/2}\,\frac{e^k n^{2k}}{2^k k^k}\left[M_0 + \frac{2^n M_n}{n!}\right]$

for $k = 1, 2, \cdots, n - 1$ and an arbitrary point $\xi$ in $[-1, 1]$.


(II) Returning to (9), we distinguish two cases. 










(IIa) We consider first the case in which

(10a)  $M_0 < \frac{M_n}{n!}, \qquad M_n' = M_n;$

see (6). If $0 < \lambda < 1$, the function $f(\lambda x)$ is defined for x in $[-1, 1]$, and we
may apply (9) to $f(\lambda x)$ instead of to f(x). We obtain

$$\lambda^k |f^{(k)}(0)| \le 3k^{1/2} n^k \left(M_0 + \frac{\lambda^n M_n}{n!}\right),$$

(11)  $|f^{(k)}(0)| \le 3k^{1/2} n^k \left(\frac{M_0}{\lambda^k} + \frac{e^n \lambda^{n-k} M_n}{n^n}\right) = 3k^{1/2}\,[M_0 t^{-k} + e^n M_n t^{n-k}].$

We used the familiar inequality $n^n/n! < e^n$, and we defined t by

$$t = \frac{\lambda}{n},$$

so that we are free to choose t in the interval $(0, 1/n)$. We choose t so that the
two terms in the square bracket in (11) become equal. This choice is admissible
because, in virtue of (10a),

$$t = \frac{1}{e}\left(\frac{M_0}{M_n}\right)^{1/n} < \frac{1}{e}\left(\frac{1}{n!}\right)^{1/n} < \frac{1}{n}.$$

We obtain from (11) by this choice

$$|f^{(k)}(0)| \le 3k^{1/2}\,2M_0 e^k \left(\frac{M_n}{M_0}\right)^{k/n},$$

so that in case (10a) holds we have proved (7).
(IIb) We consider now the remaining case, in which

(10b)  $M_0 \ge \frac{M_n}{n!}, \qquad M_n' = n!\,M_0.$

In this case, (9) yields directly

$$|f^{(k)}(0)| \le 3k^{1/2} n^k\,2M_0 = 6k^{1/2} n^k M_0^{1-k/n}\left(\frac{M_n'}{n!}\right)^{k/n} \le 6k^{1/2} n^k M_0^{1-k/n} M_n'^{\,k/n}\left(\frac{e}{n}\right)^k,$$

so that we have proved (7) also under the condition (10b), and therefore completely.
















(II*) In order to derive (7*) from (9*), we distinguish two cases. The first
case is characterized by the condition

$$M_n^* = M_n > n!\,M_0 2^{-n}.$$

In this case, we consider $f[\xi + \lambda(x - \xi)]$ instead of f(x), the point $\xi$ being fixed
in $[-1, 1]$; we choose $\lambda$ in $(0, 1)$ and proceed as under (IIa). The remaining
case is analogous to (IIb).

We consider now a theorem of a different character. Let $\inf \varphi(x)$ denote the
greatest lower bound and let $\sup \varphi(x)$ denote the least upper bound of any
real-valued function $\varphi(x)$ in $[a, b]$. With this notation, we have the following
lemma.¹¹

LEMMA 4. Let the real-valued function f(x) possess an n-th derivative in the
interval $[a, b]$. Then

$$\inf |f^{(n)}(x)| \le \frac{n!}{2}\left(\frac{4}{b - a}\right)^n \sup |f(x)|.$$

Let us consider, following Chebysheff, the polynomial

(12)  $T(x) = \cos (n \arccos x) = 2^{n-1}x^n + \cdots.$

Let c denote the center of $[a, b]$, so that $a + b = 2c$, and put

(13)  $P(x) = T\left(\frac{2(x - c)}{b - a}\right) = \frac12\left(\frac{4}{b - a}\right)^n x^n + \cdots.$

If Lemma 4 is not true, $f^{(n)}(x)$ never vanishes in $[a, b]$, and it may then be supposed,
without loss of generality,¹² to be positive in $[a, b]$. Moreover, if Lemma 4
is not true, there exists an H such that for all x in $[a, b]$

(A)  $f^{(n)}(x) > \frac{n!}{2}\left(\frac{4}{b - a}\right)^n H > \frac{n!}{2}\left(\frac{4}{b - a}\right)^n |f(x)|.$

Now consider

(14)  $\varphi(x) = HP(x) - f(x).$

We know that P(x), defined by (12) and (13), takes at $n + 1$ points of $[a, b]$
(arranged in decreasing order) alternately the values 1 and −1; in particular,
$P(b) = 1$. It follows from the second inequality (A) that $\varphi(x)$ takes alternately
positive and negative values at the $n + 1$ points we have just mentioned,
so that it vanishes at least at n different points of $(a, b)$; and, in particular,

(15)  $\varphi(b) = H - f(b) > 0.$


11 S. Bernstein, op. cit. 3b, p. 10. Our proof is a little different from the original one;
we avoid using Fourier’s rule. For another proof, see J. Shohat, A simple proof of a formula
of Tchebycheff, Tohoku Mathematical Journal, vol. 36(1932-1933), pp. 230-235.

12 If $f^{(n)}(x)$ is never zero, it cannot take both positive and negative values, by a
well-known theorem of Darboux.










We say that $\varphi(x)$ cannot vanish at more than n points of $(a, b)$; otherwise, by
Rolle’s theorem, $\varphi'(x)$ would vanish at n points, $\varphi''(x)$ at $n - 1$ points, and
so on; finally $\varphi^{(n)}(x)$ would vanish once. But (see (14) and (13))

(16)  $\varphi^{(n)}(x) = H\,\frac{n!}{2}\left(\frac{4}{b - a}\right)^n - f^{(n)}(x) < 0 \qquad (a \le x \le b)$

by the first inequality (A). Thus $\varphi(x)$ has exactly n zeros in $(a, b)$, $\varphi'(x)$ has
exactly $n - 1$, etc. Moreover, the $n - 1$ zeros of $\varphi'(x)$ must separate the n
zeros of $\varphi(x)$; therefore, between the last zero of $\varphi(x)$ and the point b, $\varphi(x)$ and
$\varphi'(x)$ must keep the same sign (they obviously have the same sign in a right-hand
neighborhood of this last zero). Thus, by (15), we have

$$\varphi'(b) > 0.$$

In the same way we have $\varphi''(b) > 0$, $\varphi'''(b) > 0$, $\cdots$. But here we arrive at a
contradiction, because, by (16), $\varphi^{(n)}(b) < 0$. To avoid the contradiction, we
must discard (A), and so Lemma 4 is proved.
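The constant in Lemma 4 is attained by the Chebyshev polynomial itself, as the role of P(x) in the proof suggests. The sketch below is only an illustration: it builds the integer coefficients of $T_n$ by the usual recurrence, reads off the constant n-th derivative $n!\,2^{n-1}$, and compares it with the right side of Lemma 4 on [a, b] = [−1, 1] with sup |f| = 1. The function names are assumptions made for this example.

```python
import math


def chebyshev_coeffs(n):
    """Integer coefficients of T_n, lowest degree first."""
    t_prev, t_cur = [1], [0, 1]
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in t_cur]  # coefficients of 2 x T_m(x)
        for i, c in enumerate(t_prev):      # subtract T_{m-1}(x)
            nxt[i] -= c
        t_prev, t_cur = t_cur, nxt
    return t_cur


def nth_derivative_of_T(n):
    """The n-th derivative of T_n is the constant n! times its leading coefficient."""
    return math.factorial(n) * chebyshev_coeffs(n)[-1]


def lemma4_bound(n, a, b, sup_f):
    """Right side of Lemma 4: (n!/2) (4/(b - a))^n sup|f|."""
    return math.factorial(n) / 2 * (4 / (b - a)) ** n * sup_f
```

With f = T_n the two sides of Lemma 4 coincide, so the inequality cannot hold with a smaller constant.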

We use Lemma 4 to prove the following lemma.¹³

LEMMA 5. If the real-valued function f(x) possesses an n-th derivative $f^{(n)}(x)$
which is monotonic in $[a, b]$, if $|f(x)| \le M$ in $[a, b]$, and if $0 < l < \tfrac12(b - a)$,
then, in $[a + l, b - l]$,

(17)  $|f^{(n)}(x)| \le \frac{n!}{2}\left(\frac{4}{l}\right)^n M.$

Because $f^{(n)}(x)$ is monotonic, the maximum of $|f^{(n)}(x)|$ in $[a + l, b - l]$ is
attained at one of the end-points of that interval, and this maximum coincides
with the minimum of $|f^{(n)}(x)|$ in the closed interval of length l which is separated
from $[a + l, b - l]$ by the end-point in question.¹⁴ Applying Lemma 4 to
this interval of length l, we obtain (17).

It may contribute to our understanding of Lemma 5 to compare it with the
following lemma on analytic functions.

If the analytic function f(z) is regular, $|f(z)| \le M$ in a circle of radius r, and
$0 < l < \tfrac12 r$, then in the concentric circle of radius $r - l$

$$|f^{(n)}(z)| \le n!\,l^{-n} M.$$


4. A new lemma on derivatives of real functions. We are now prepared
to prove

13 See E. Landau, Über einen Satz von Herrn Esclangon, Mathematische Annalen, vol.
102(1929-1930), pp. 177-188.

14 $f^{(n)}(x)$, as a monotonic derivative, is continuous by the theorem of Darboux quoted
in footnote 12.














LEMMA 6. If p and q are positive integers, and g(x) is real-valued, possesses a
$(p + 2q)$-th derivative, and satisfies

(18)  $|g(x)| \le M, \qquad g^{(p+2q)}(x) \le 0$

in $[-1, 1]$, then

$$g^{(p)}(x) \le A^{p+2q}(p + 2q)^p M$$

in $[-1, 1]$, where $A = 30e^{1+30e}$.

The main point is that A is an absolute constant, independent of the choice
of g(x), p, and q. It is also important that the estimate for $g^{(p)}(x)$ holds throughout
$[-1, 1]$, and not merely in a sub-interval.

We put

(19)  $p + 2q - 1 = n.$

By the second condition (18), $g^{(n)}(x)$ is steadily decreasing. Therefore we may
apply Lemma 5; we obtain from (17)

$$|g^{(n)}(x)| \le n!\left(\frac{4}{h}\right)^n M \qquad (-1 + h \le x \le 1 - h),$$

where $0 < h < 1$. By this inequality and the first condition (18), the hypothesis
(5) of Lemma 3 is fulfilled in $[-1 + h, 1 - h]$. In order to simplify the application
of Lemma 3, we choose h, taking (6) into consideration, so that

$$n!\left(\frac{4}{h}\right)^n M = n!\,M(1 - h)^{-n};$$

i.e., we put $h = 4/5$. With this choice (7) yields

$$|g^{(k)}(0)| \le 6k^{1/2}e^k M^{1-k/n}(n!\,M5^n)^{k/n},$$

(20)  $|g^{(k)}(0)| \le (30en)^k M$

for $k = 1, 2, 3, \cdots, n$.
By Taylor’s formula 


(20) 


mg) 4, 4 STGP NO , atg?™@ 

1! (2q¢ — 1)! (2g)! ’ 
where 0 < & < x. The last term is not positive, by the second condition (18)- 
Using (20) for the other terms (see (19)), we obtain in [—1, 1] 


(30en)? M (1 + - + ar . ‘) 


g(a) = g (0) + 


lA 


g°” (x) 


ll 


(30en)? Me™” 
< (30e-c)"n’ M. 


This, with (19), proves Lemma 6. 





416 R. P. BOAS, JR. AND G. POLYA 


5. Proof of Theorem 1. We consider the function $f(x)$ of Theorem 1. Let $M_n$ denote the maximum of $|f^{(n)}(x)|$ in $[-1, 1]$.

We know that both $f^{(n_k)}(x)$ and $f^{(n_k+2q_k)}(x)$ keep a constant sign in $[-1, 1]$. We may assume, without loss of generality, that

(21)    $f^{(n_k+2q_k)}(x) \le 0$

in $[-1, 1]$. It follows that

(22)    $f^{(n_k)}(x) \ge 0$

in $[-1, 1]$. We now apply Lemma 6, putting

$g(x) = f^{(n_{k-1})}(x), \qquad p = n_k - n_{k-1}, \qquad q = q_k, \qquad M = M_{n_{k-1}}.$

Condition (18) is satisfied since we have (21). Using (22) also, we obtain from Lemma 6 that

(23)    $M_{n_k} \le A^{n_k-n_{k-1}+2q_k} (n_k - n_{k-1} + 2q_k)^{n_k-n_{k-1}} M_{n_{k-1}}.$

We also have

(24)    $M_{n_1} \le A^{n_1+2q_1} (n_1 + 2q_1)^{n_1} M,$

$M$ being the maximum of $|f(x)|$ in $[-1, 1]$. We may consider (24) as the special case of (23) in which $k = 1$, if we write $n_0 = 0$. By (24) and repeated application of (23) we obtain

(25)    $M_{n_k} \le M A^{n_k+2(q_1+q_2+\cdots+q_k)} \prod_{\nu=1}^{k} (n_\nu - n_{\nu-1} + 2q_\nu)^{n_\nu-n_{\nu-1}}.$

This inequality (25) supplies us with an estimate for the derivatives $f^{(n_1)}(x), f^{(n_2)}(x), \dots, f^{(n_k)}(x), \dots$ in $[-1, 1]$. Starting from this, we can estimate the remaining derivatives at $x = 0$, using (7) of Lemma 3. This is the main idea of the proof; in order to supply the details, we must consider the cases (I), (II), and (III) separately.

(I) The particular hypothesis characterizing case (I) is the existence of a positive constant $B$ such that, for $k = 1, 2, 3, \dots$,

(26)    $n_k - n_{k-1} < B, \qquad q_k < B.$

From this and (25), we obtain

(27)    $M_{n_k} < M (A^{1+2B}\, 3B)^{n_k} = M C^{n_k}.$

We now apply inequality (7) of Lemma 3 to $f^{(n_k)}(x)$ instead of $f(x)$, in the interval $[-1, 1]$, with $n_{k+1} - n_k$ and $h$ ($h = 1, 2, \dots, n_{k+1} - n_k$) instead of $n$ and $k$. We must first (see (6)) estimate

$\max\,(M_{n_{k+1}},\ (n_{k+1} - n_k)!\, M_{n_k}).$










We find, using (26) and (27),

$\max\,(M_{n_{k+1}},\ (n_{k+1} - n_k)!\, M_{n_k}) \le \max\,(M C^{n_{k+1}},\ B^{n_{k+1}-n_k} M C^{n_k}) = M C^{n_{k+1}},$

since, by (27), $B < C$.
Lemma 3 now yields

(28)    $|f^{(n_k+h)}(0)| \le 6^{h} e^{h} M_{n_k}^{1-h/(n_{k+1}-n_k)} (M C^{n_{k+1}})^{h/(n_{k+1}-n_k)}.$

Using (26) and (27) again, we obtain

$|f^{(n_k+h)}(0)| < 6^{B} e^{B} M C^{n_k+h} \qquad (h = 1, 2, \dots, n_{k+1} - n_k),$

so that we have, with an appropriate constant $D$,

(29)    $|f^{(n)}(0)| < D C^{n} \qquad (n = 0, 1, 2, \dots).$

By (29), we see that the Maclaurin series of $f(x)$ converges everywhere, and is the Maclaurin series of an entire function of exponential type. But this series actually represents $f(x)$ in the interval $[-1, 1]$ because the remainder after the term containing $x^{n_k-1}$ approaches zero (use (27) and Lagrange's form of the remainder).

Thus we have proved case (I) of Theorem 1. Moreover, we have obtained a pattern which we may follow in proving the remaining cases. First, we estimate the derivatives $f^{(n_1)}(x), f^{(n_2)}(x), \dots, f^{(n_k)}(x), \dots$ in the whole interval $[-1, 1]$, using (25). Second, we estimate the other derivatives at the point 0, using inequality (7) of Lemma 3 as we used it here for (28). These estimates must show that the coefficients of the Maclaurin series of $f(x)$ have orders of magnitude according with the respective statements of Theorem 1. The argument showing that the series actually represents $f(x)$ in $[-1, 1]$ is the same in all cases, and need not be repeated.

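(II) The particular hypothesis characterizing case (II) is the existence of a positive constant $B$ such that, for $k = 1, 2, 3, \dots$,

(30)    $n_k - n_{k-1} < B n_k^{\delta}, \qquad q_k < B n_k^{\delta},$

(31)    $q_1 + q_2 + \cdots + q_k < B n_k.$

Then (25) yields

(32)    $M_{n_k} < M A^{n_k+2Bn_k} \prod_{\nu=1}^{k} (3B n_\nu^{\delta})^{n_\nu-n_{\nu-1}} \le M (A^{1+2B}\, 3B)^{n_k}\, n_k^{\delta n_k} = M C^{n_k} n_k^{\delta n_k}.$

In order to prepare for the application of Lemma 3, we consider

$\max\,(M_{n_{k+1}},\ (n_{k+1} - n_k)!\, M_{n_k}) \le \max\,(M C^{n_{k+1}} n_{k+1}^{\delta n_{k+1}},\ (B n_{k+1}^{\delta})^{n_{k+1}-n_k} M C^{n_k} n_k^{\delta n_k}) \le M C^{n_{k+1}} n_{k+1}^{\delta n_{k+1}}.$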





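Here we have used (30) and (32); by the latter, $B < C$.
We now apply inequality (7) of Lemma 3 to

$f^{(n_k)}(x), \quad [-1, 1], \quad n_{k+1} - n_k, \quad h$

instead of to

$f(x), \quad [-l, l], \quad n, \quad k;$

we obtain, using (32),

(33)    $|f^{(n_k+h)}(0)| \le (6e)^{h} (M C^{n_k} n_k^{\delta n_k})^{1-h/(n_{k+1}-n_k)} (M C^{n_{k+1}} n_{k+1}^{\delta n_{k+1}})^{h/(n_{k+1}-n_k)} = (6e)^{h} M C^{n_k+h} (n_k + h)^{\delta(n_k+h)} e^{\delta R},$

where

(34)    $R = (1 - \alpha)\varphi(n) + \alpha\varphi(N) - \varphi(\nu);$

we are using the abbreviations

(35)    $n_k = n, \qquad n_{k+1} = N, \qquad h/(n_{k+1} - n_k) = \alpha, \qquad n_k + h = \nu,$

(36)    $x \log x = \varphi(x),$

so that

(37)    $(1 - \alpha)n + \alpha N = \nu.$

But,

$\varphi(n) = \varphi(\nu) + (n - \nu)\varphi'(\nu) + \tfrac12 (n - \nu)^{2} \varphi''(\nu_1),$
$\varphi(N) = \varphi(\nu) + (N - \nu)\varphi'(\nu) + \tfrac12 (N - \nu)^{2} \varphi''(\nu_2),$

with $n < \nu_1 < \nu < \nu_2 < N$; and therefore since (see (36)) $\varphi''(x) = 1/x$ is a decreasing function, it follows from (34) and (37) that

$R = \tfrac12 [(1 - \alpha)(n - \nu)^{2} \varphi''(\nu_1) + \alpha (N - \nu)^{2} \varphi''(\nu_2)] < \tfrac12 \alpha(1 - \alpha)(N - n)^{2} \varphi''(n) \le \tfrac18 (N - n)^{2} \varphi''(n).$

Returning to (36), (35), (30), we obtain

$R < (n_{k+1} - n_k)^{2} n_k^{-1} < B^{2} n_{k+1}^{2\delta} n_k^{-1}, \qquad n_{k+1} < n_k (1 - B n_{k+1}^{\delta-1})^{-1}.$

Using this, we see from (33) and (32) that there is a constant $D$ such that for $n = 1, 2, 3, \dots$

$|f^{(n)}(0)| < D^{n} n^{\delta n}.$

By this inequality, the Maclaurin series of $f(x)$ is that of an entire function of finite order not exceeding $1/(1 - \delta)$.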
(III) This case is characterized by the conditions

$\lim_{k\to\infty} \frac{n_{k+1}}{n_k} = 1, \qquad \lim_{k\to\infty} \frac{q_k}{n_k} = 0,$

and (31). The application of (25) and of Lemma 3 to this case follows closely


the developments in case (II); carrying through the calculation, we are led 










again to the consideration of the same expression (34). So we leave this case to the reader, who may show that, given any positive $\varepsilon$, we have for all sufficiently large $k$ and $n$

$M_{n_k} < (\varepsilon n_k)^{n_k}, \qquad |f^{(n)}(0)| < (\varepsilon n)^{n}.$


6. Examples and comments. Before discussing the separate cases of Theorems 1 and 2, we shall make a few rather obvious remarks concerning all cases.

Since all properties of $f(x)$ considered in Theorems 1 and 2 are unchanged by a non-fractional linear transformation, these theorems do not change substantially when we replace $[-1, 1]$ by any other closed interval. Any function $f(x)$ which satisfies the hypothesis of any case of Theorem 2 also satisfies the hypothesis of the corresponding case of Theorem 1, if we put $\tfrac12(n_{k+1} - n_k) = q_k$. Consequently we are at liberty to choose another interval instead of $[-1, 1]$, or to content ourselves with the consideration of one of Theorems 1 and 2 in the following discussion.

(I) All cases of Theorem 2 are concerned with a function $f(x)$ having an infinite sequence of derivatives of even order, $f^{(n_1)}(x), f^{(n_2)}(x), \dots$, which do not change sign in a fixed interval, the signs being alternately $+$ and $-$. The case (I) adds the particular hypothesis that the sequence $n_1, n_2, \dots$ does not increase more rapidly than an arithmetic progression, in the sense that $n_k - n_{k-1}$ remains bounded; and draws the particular conclusion that the growth of $f(x)$ does not exceed some finite type of order one.

Can we draw the same conclusion from a less restricted hypothesis? We have no complete answer to this question. However, the example which we shall discuss under (II) of this section shows that if the difference $n_k - n_{k-1}$ increases as slowly as $k^{\varepsilon}$, where $\varepsilon$ is a fixed positive number, $f(x)$ may have order greater than one.

Can we draw a stronger conclusion from the same hypothesis? Here we have a complete answer. No hypothesis whatever on the signs of the derivatives in a fixed interval can imply that the growth of the function is less than order one and finite type. In fact, given any sequence $\varepsilon_0, \varepsilon_1, \dots, \varepsilon_n, \dots$, where $\varepsilon_n = 1$ or $-1$, there exists a function $g(x)$ of order and type 1, such that $g^{(n)}(x)$ has the sign of $\varepsilon_n$ in $(-\log 2, \log 2)$ for $n = 0, 1, 2, \dots$. Such a function $g(x)$ is defined by

(38)    $g(x) = \varepsilon_0 + \frac{\varepsilon_1 x}{1!} + \frac{\varepsilon_2 x^{2}}{2!} + \cdots + \frac{\varepsilon_n x^{n}}{n!} + \cdots.$

For,

$g^{(n)}(x) = \varepsilon_n + \frac{\varepsilon_{n+1} x}{1!} + \frac{\varepsilon_{n+2} x^{2}}{2!} + \cdots,$

(39)    $\varepsilon_n g^{(n)}(x) \ge 1 - \frac{|x|}{1!} - \frac{|x|^{2}}{2!} - \cdots = 2 - e^{|x|} > 0$

for $|x| < \log 2$.
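The bound (39) can be checked numerically. The following is a minimal sketch, not part of the original argument: it draws an arbitrary sign sequence, truncates the series (38) after a fixed number of terms (an assumption made only for the illustration), and tests the lower bound $2 - e^{|x|}$ on a grid inside $(-\log 2, \log 2)$. The function names `g_derivative` and `signs_hold` are hypothetical.

```python
import math
import random


def g_derivative(eps, n, x, terms=60):
    """Truncated series for g^(n)(x) = sum_j eps[n+j] * x**j / j!."""
    return sum(eps[n + j] * x**j / math.factorial(j) for j in range(terms))


def signs_hold(seed=0, orders=20, terms=60, grid=10):
    """Check eps_n * g^(n)(x) >= 2 - e^{|x|} > 0 on a grid in (-log 2, log 2)."""
    rng = random.Random(seed)
    # Arbitrary signs eps_0, eps_1, ... (enough for every order and every term).
    eps = [rng.choice((1, -1)) for _ in range(orders + terms)]
    # Grid points strictly inside (-log 2, log 2).
    xs = [0.99 * math.log(2) * i / grid for i in range(-grid, grid + 1)]
    for n in range(orders):
        for x in xs:
            lower = 2.0 - math.exp(abs(x))  # right-hand side of (39)
            value = eps[n] * g_derivative(eps, n, x, terms)
            if lower <= 0.0 or value < lower - 1e-12:
                return False
    return True


if __name__ == "__main__":
    print(all(signs_hold(seed) for seed in range(5)))
```

Truncating the series only removes negative contributions from the lower estimate, so the truncated sums satisfy the same bound as the full series.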







Consideration of the particular function (38) leads to a complement which applies to all three cases of our theorems.¹⁵

Theorems 1 and 2 remain valid if we change the inequalities which the derivatives of $f(x)$ are assumed to satisfy in the following way.

Instead of $f^{(n)}(x) \ge 0$ we assume only $f^{(n)}(x) \ge -\rho^{n}$, instead of $f^{(n)}(x) \le 0$ we assume only $f^{(n)}(x) \le \rho^{n}$, where $\rho$ is a fixed positive number.

First, by considering $f(x/\rho)$ instead of $f(x)$, we can reduce the general case to the special case where $\rho = 1$. Second, it is sufficient to prove the theorem in an interval $[-l, l]$, where $l \le \tfrac12$, because shifting the interval does not change anything that matters, and any ("large") interval can be covered by a finite number of ("slightly") overlapping intervals of length not exceeding 1. Now construct the sequence $\varepsilon_0, \varepsilon_1, \dots, \varepsilon_n, \dots$ according to the following rule. If it is assumed that $f^{(n)}(x) \ge -1$, take $\varepsilon_n = 1$; if it is assumed that $f^{(n)}(x) \le 1$, take $\varepsilon_n = -1$; and take $\varepsilon_n$ arbitrarily, e.g., $\varepsilon_n = 1$, if nothing is assumed about $f^{(n)}(x)$. With these numbers $\varepsilon_n$ we define $g(x)$ by (38). Finally, we consider

$h(x) = f(x) + g(x)/(2 - e^{l}).$

Inequality (39) shows that $h^{(n)}(x) \ge 0$ or $h^{(n)}(x) \le 0$ in $[-l, l]$ according as it was originally assumed that $f^{(n)}(x) \ge -1$ or $f^{(n)}(x) \le 1$. Then the original form of Theorem 1 (or Theorem 2) may be applied to $h(lx)$.

(II) There exist entire functions of order exactly $1/(1 - \delta)$ satisfying the hypothesis of case (II) of Theorem 2 (and, therefore, also the hypothesis of case (II) of Theorem 1).

In other words, the conclusion concerning the order of $f(x)$ which we deduced from the hypothesis of case (II) is the strongest possible.

We are given $\delta$, $0 < \delta < 1$. We put

(40)    $\frac{1}{1 - \delta} = \rho,$

(41)    $2[k^{\rho}] = n_k \qquad (k = 1, 2, 3, \dots);$

we use from now on the usual convention that $[r]$ denotes the integral part of the real number $r$. It is evident that the function

(42)    $f(x) = \sum_{\nu=1}^{\infty} (-1)^{\nu}\, \frac{n_\nu^{\delta n_\nu}\, x^{n_\nu}}{n_\nu!}$

is entire and that its order is exactly $1/(1 - \delta) = \rho$. We shall show that it satisfies the hypothesis of case (II) of Theorem 2.
The integers $n_1, n_2, \dots$ defined by (41) are even, and (see (40))

$\lim_{k\to\infty} (n_k - n_{k-1})\, n_k^{-\delta} = \rho\, 2^{1-\delta}.$
15 This complement was found in a conversation between one of the authors and Professor 
E. Artin. 













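Thus the condition $n_k - n_{k-1} = O(n_k^{\delta})$ is certainly satisfied. We obtain from (42) that

(43)    $\frac{(-1)^{k} f^{(n_k)}(x)}{n_k^{\delta n_k}} = 1 + \sum_{\nu=1}^{\infty} (-1)^{\nu}\, \frac{n_{k+\nu}^{\delta n_{k+\nu}}}{n_k^{\delta n_k}}\, \frac{x^{n_{k+\nu}-n_k}}{(n_{k+\nu} - n_k)!}.$

We introduce the abbreviations

(44)    $n_k = n, \qquad n_{k+\nu} = N,$

and estimate the absolute value of the general term of the series thus:

(45)    $\frac{N^{\delta N}}{n^{\delta n}}\, \frac{|x|^{N-n}}{(N - n)!} < \frac{\left(1 + \frac{N-n}{n}\right)^{\delta n} N^{(N-n)\delta}\, |x|^{N-n}}{\{2\pi(N - n)\}^{1/2} (N - n)^{N-n} e^{-(N-n)}} < \frac{1}{\{2\pi(N - n)\}^{1/2}} \left(\frac{e^{1+\delta}\, |x|\, N^{\delta}}{N - n}\right)^{N-n}.$

Now $x^{\delta}/(x - n)$ is a decreasing function of $x$ for $x > n$, as is easily shown by differentiation. Therefore, by (44) and (41),

(46)    $\frac{N^{\delta}}{N - n} \le \frac{n_{k+1}^{\delta}}{n_{k+1} - n_k} < \frac{2^{\delta}(k + 1)^{\rho\delta}}{2\{(k + 1)^{\rho} - 1 - k^{\rho}\}} < \frac{2^{\delta-1}(k + 1)^{\rho\delta}}{\rho k^{\rho-1} - 1} < 1$

for sufficiently large $k$; observe that, by (40), $\rho\delta = \rho - 1$. From this inequality and (45) it follows that, for large $k$, (43) is certainly positive in $[-\tfrac12 e^{-1-\delta}, \tfrac12 e^{-1-\delta}]$. This shows that the function (42) satisfies all the requirements of case (II).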
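The growth condition on the sequence (41) and the first quotient in (46) are easy to inspect numerically. The sketch below is only an illustration under the assumption $\delta = 1/2$ (so $\rho = 2$); the helper names `n_seq` and `check_case_two` are hypothetical and do not come from the paper.

```python
import math


def n_seq(k, rho):
    """n_k = 2 * [k**rho] as in (41); [r] denotes the integral part."""
    return 2 * math.floor(k**rho)


def check_case_two(delta=0.5, k=10**6):
    """Return the ratio (n_k - n_{k-1}) / n_k**delta, its claimed limit,
    and the quotient n_{k+1}**delta / (n_{k+1} - n_k) appearing in (46)."""
    rho = 1.0 / (1.0 - delta)  # definition (40)
    nk, nk_prev, nk_next = n_seq(k, rho), n_seq(k - 1, rho), n_seq(k + 1, rho)
    ratio = (nk - nk_prev) / nk**delta
    limit = rho * 2.0 ** (1.0 - delta)
    bound = nk_next**delta / (nk_next - nk)
    return ratio, limit, bound


if __name__ == "__main__":
    print(check_case_two())
```

For large $k$ the first two returned values should be close to each other, and the third should be below 1, in agreement with (46).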

(III) The hypothesis of case (III) of Theorem 2 concerning the sequence $n_1, n_2, \dots, n_k, \dots$, namely that

$\lim_{k\to\infty} \frac{n_{k+1}}{n_k} = 1,$

cannot be replaced by the weaker hypothesis

$\limsup_{k\to\infty} \frac{n_{k+1}}{n_k} = a,$

where $a > 1$, without invalidating the conclusion that $f(x)$ is an entire function. In fact the following statement is true.

If $a > 1$, there exist an increasing sequence of even numbers $n_1, n_2, \dots, n_k, \dots$ and a function $f(x)$, analytic in a given interval but not entire, such that $(-1)^{k} f^{(n_k)}(x) > 0$ in that interval and

(47)    $\lim_{k\to\infty} \frac{n_{k+1}}{n_k} = a.$







The construction is much the same as that of the preceding example. This time we put

$n_k = 2[a^{k}],$

and the function $f(x)$ is again given by (42), but with $\delta$ replaced by 1. Then evidently the radius of convergence of the series is finite (in fact, $1/e$). Following (44), (45), (46) with obvious changes, we show that, for sufficiently large $k$, (43) is certainly positive in $[-(a - 1)/(2ae^{2}), (a - 1)/(2ae^{2})]$.

(IV) One might ask whether a function satisfying the hypothesis of Theorem 2 necessarily has all its derivatives of constant sign in some interval, not necessarily the whole of $[-1, 1]$. If this were the case, Theorem 2 would be contained in Bernstein's results. However, the following example shows that for any increasing sequence of integers $n_1, n_2, \dots$ with the property $n_{k+1} - n_k \ge 2$ and for any sequence $\varepsilon_1, \varepsilon_2, \dots$ ($\varepsilon_k = \pm 1$), there is a function $f(x)$ such that $\varepsilon_k f^{(n_k)}(x) > 0$ in $[-1, 1]$, while the points at which the derivatives $f^{(n)}(x)$ change sign form a set which is everywhere dense in $[-1, 1]$.

Let $a_0, a_1, a_2, \dots$ be points of $[-1, 1]$. Define $P_0(x) = 1$,

$P_n(x) = \int_{a_0}^{x} dx_1 \int_{a_1}^{x_1} dx_2 \cdots \int_{a_{n-1}}^{x_{n-1}} dx_n \qquad (n = 1, 2, \dots).$

Then evidently

(48)    $|P_n(x)| \le 2^{n} \qquad (-1 \le x \le 1;\ n = 0, 1, 2, \dots),$

(49)    $P_n^{(n)}(x) = 1.$

Now let $a_{n_1-1}, a_{n_2-1}, \dots$ be a sequence of numbers everywhere dense in $[-1, 1]$; let $a_n = 0$ when $n$ is not a member of the sequence $n_1 - 1, n_2 - 1, \dots$. Let $A_{n_k} = \varepsilon_k b^{n_k}$ ($k = 1, 2, \dots$), with $0 < b < \tfrac14$; let $A_n = 0$ when $n$ is not a member of the sequence $n_1, n_2, \dots$. Then

(50)    $f(x) = \sum_{n=0}^{\infty} A_n P_n(x)$

has the desired property. For, since for $k = 1, 2, \dots, n - 1$,

$P_n^{(k)}(x) = \int_{a_k}^{x} dx_1 \int_{a_{k+1}}^{x_1} dx_2 \cdots \int_{a_{n-1}}^{x_{n-k-1}} dx_{n-k},$

by (48) the series in (50) is dominated by $\sum_{n=0}^{\infty} (2b)^{n}$, and the formal series $f^{(k)}(x)$ is dominated by $\sum_{n=k}^{\infty} b^{n} 2^{n-k}$. Hence $f^{(k)}(x)$ may be obtained by termwise differentiation of (50). For $x$ in $[-1, 1]$, by (49),

$f^{(n_k)}(x) = A_{n_k} + \sum_{n=n_k+1}^{\infty} A_n P_n^{(n_k)}(x);$








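by (50),

$\varepsilon_k f^{(n_k)}(x) > b^{n_k} - \sum_{n=n_k+1}^{\infty} b^{n} 2^{n-n_k} = b^{n_k}\, \frac{1 - 4b}{1 - 2b} > 0.$

On the other hand, $\varepsilon_k f^{(n_k-1)}(x)$ increases in $[-1, 1]$, has the value $A_{n_k-1} = 0$ at $x = a_{n_k-1}$, and hence changes sign at $a_{n_k-1}$.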

(V) If in cases (I) and (II) of Theorems 1 and 2 we require only that the inequalities involving $f^{(n_k)}(x)$ hold in an interval $[-l_k, l_k]$, where $l_k$ does not approach zero too rapidly, we can still show that the Maclaurin series of $f(x)$ is the Maclaurin series of an entire function. Sufficient restrictions on $l_k$ in the two cases are

(I)    $k\, l_k \to \infty \qquad (k \to \infty),$
(II)   $n_k^{1-\delta}\, l_k \to \infty \qquad (k \to \infty).$

Hence we may state, for example,

If $f(x)$ is analytic¹⁶ in $[-1, 1]$, if in $[-l_k, l_k]$ $f^{(n_k)}(x)$ and $f^{(n_k+2q_k)}(x)$ do not change sign and $f^{(n_k)}(x) f^{(n_k+2q_k)}(x) \le 0$, where $n_k - n_{k-1} = O(1)$, $q_k = O(1)$, and $k\, l_k \to \infty$, then $f(x)$ is an entire function.

It will be clear that the order of the entire function will be smaller, the less rapidly $l_k \to 0$.

To establish our assertion, we have only to make slight modifications in the proof of Theorem 1. It is convenient to suppose that $l_1 > l_2 > \cdots$. We apply Lemma 6 to the function $g(x) = f^{(n_{k-1})}(l_k x)$. Inequality (23) becomes

$M_{n_k} \le l_k^{-(n_k-n_{k-1})} A^{n_k-n_{k-1}+2q_k} (n_k - n_{k-1} + 2q_k)^{n_k-n_{k-1}} M_{n_{k-1}},$

where $M_{n_k}$ now denotes the maximum of $|f^{(n_k)}(x)|$ in $[-l_k, l_k]$. Since $l_k$ is a decreasing sequence, (25) is replaced by

$M_{n_k} \le M l_k^{-n_k} A^{n_k+2(q_1+\cdots+q_k)} \prod_{\nu=1}^{k} (n_\nu - n_{\nu-1} + 2q_\nu)^{n_\nu-n_{\nu-1}}.$


p=1 


$M_{n_k} < M \left(\frac{C}{l_k}\right)^{n_k}.$


Following the rest of the proof of case (I) of Theorem 1, we find that for h = 


0,1, -+-* , Mer: — Me, 
net+h 
0) | s B(c)" 


Lesa 


Using (26), we obtain 


16 Or even belongs to a Denjoy-Carleman quasi-analytic class. 







with a suitable constant Z. Hence 


$\left|\frac{f^{(n_k+h)}(0)}{(n_k + h)!}\right|^{1/(n_k+h)} \to 0$

if

$\frac{C}{l_{k+1}\, n_k} \to 0,$

i.e., if

$n_k l_{k+1} \to \infty.$

But since $k \le n_k \le kB$, this is equivalent to

$k\, l_k \to \infty.$

Case (II) is treated similarly.


DUKE UNIVERSITY AND BROWN UNIVERSITY.













THE DISTRIBUTION OF PRIMES 
By AUREL WINTNER


1. Simple prime factors. If $f(n)$ is a function of the positive integer $n$, let $f_t$ denote the set of the solutions $n$ of $f(n) = t$, and let $f_t(x)$ be the number of those elements of $f_t$ which are less than $x$.

Thus, if $f(n)$ is the number of distinct primes dividing $n$ or is 0 according as $n$ is or is not square-free, then $n$ is in $f_0$ if and only if it is not square-free so that $f_0(x) \sim (1 - \zeta(2)^{-1})x$ as $x \to \infty$. On the other hand, if $m > 0$, then $f_m(x)$ is the number of those integers $n$ less than $x$ which are composed of exactly $m$ distinct prime factors, a number usually denoted by $\pi_m(x)$. Apparently, it was observed already by Gauss¹ that the prime number theorem, i.e., $\pi(x) \sim x(\log x)^{-1}$, implies, for every fixed $m$ (= 1, 2, ...), the asymptotic relation

(1)    $\pi_m(x) \sim L_m(x),$

where

$L_m(x) = \frac{x(\log x)^{-1}(\log\log x)^{m-1}}{(m - 1)!}.$

Thus $L_1(x) + L_2(x) + \cdots = x$, although

$\pi_1(x) + \pi_2(x) + \cdots = [x] - f_0(x) \sim x/\zeta(2).$

The latter anomaly presents itself also in case of the function $f(n) = \theta(n)$ which plays a central rôle in the following considerations and represents the number of simple prime factors of $n$ (for instance, $\theta(15) = 2$, $\theta(60) = 2$, $\theta(24) = 1$). Clearly, there exists for every $n$ exactly one $m$ for which the set $\theta_m$ contains $n$ so that $\theta_0(x) + \theta_1(x) + \cdots = [x] \sim x$. However, for every fixed $m$,

(2)    $\theta_m(x) \sim \text{const.}\ L_m(x),$

where

$\text{const.} = \frac{\zeta(2)\zeta(3)}{\zeta(6)}.$

In fact, if $m$ is fixed, an $n$ is in $\theta_m$ if and only if $n = p_1 \cdots p_m j$ holds for $m$ distinct primes $p_1, \dots, p_m$ and for a $j$ having only multiple prime factors each of which is distinct from $p_1, \dots, p_m$. Since $\pi_m(x)$ is the number of those integers less than $x$ which are of the form $p_1 \cdots p_m$, it follows that, in order to pass from (1) to (2), it is sufficient to show that $\sum 1/j$ has a finite
Received January 7, 1942. 

1C, F. Gauss, Werke, vol. 10, part 1, 1917, p. 11 and p. 17. For the remainder term, cf. 
E. Landau, Uber die Verteilung der Zahlen, welche aus v Primfaktoren zusammengesetzt sind, 
Géttingen Nachrichten, 1911, pp. 361-381. 




value, $\zeta(2)\zeta(3)/\zeta(6)$, where $j$ runs through the set of all positive integers possessing no simple prime factors. But the product of two arbitrary elements of the latter set is an element of this set. Furthermore, a prime power $p^{k}$ is in the set if and only if $k \ne 1$. Hence, by Euler's factorization, the sum $\sum 1/j$ is the product of all factors $1 + 0 + p^{-2} + p^{-3} + \cdots$. Since the latter sum, being equal to $1 + p^{-2}/(1 - p^{-1})$, is identical with the reciprocal value of $(1 - p^{-2})(1 - p^{-3})/(1 - p^{-6})$, the assertion follows by applying to $s = 2, 3, 6$ the product definition of $\zeta(s)$.
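The Euler factorization just described can be compared numerically with the closed form $\zeta(2)\zeta(3)/\zeta(6)$. The following is a rough sketch, not taken from the paper: the product is truncated at a prime bound chosen for the illustration, the value of $\zeta(3)$ is entered as a decimal constant, and the names `primes_up_to`, `euler_product` and `closed_form` are hypothetical.

```python
import math

ZETA_3 = 1.2020569031595942  # decimal value of zeta(3), entered as a constant


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def euler_product(limit):
    """Truncated product of the factors 1 + p^-2 + p^-3 + ... over primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 + p**-2 / (1.0 - 1.0 / p)
    return product


def closed_form():
    """zeta(2) * zeta(3) / zeta(6), using zeta(2) = pi^2/6 and zeta(6) = pi^6/945."""
    return (math.pi**2 / 6.0) * ZETA_3 / (math.pi**6 / 945.0)


if __name__ == "__main__":
    print(euler_product(10**5), closed_form())
```

The two printed numbers should agree to several decimal places; the remaining gap comes from the primes omitted beyond the chosen bound.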

These remarks will now be combined with certain general facts regarding addi- 
tive functions. 


2. Additive functions. A function $f(n)$ is called additive if

(3)    $f(n_1 n_2) = f(n_1) + f(n_2)$

whenever

$(n_1, n_2) = 1,$

i.e., $n_1$ and $n_2$ are relatively prime (in particular $f(1) = 0$). Thus an additive $f(n)$ can uniquely be characterized by an arbitrary assignment of the double sequence formed by the values $f(q^{k})$, where $q$ and $k$ run through all primes and through all positive integers respectively. It also is clear that, if $f^{p}(n)$ denotes, for a fixed prime $p$, that additive function for which the value $f^{p}(q^{k})$ is 0 or $f(p^{k})$ according as the prime $q$ is or is not distinct from $p$, then $f(n) = \sum f^{p}(n)$ for every $n$ (it being understood that, although the summation runs through all primes $p$, the sum is finite for every $n$, since $f^{p}(n) = 0$ whenever $p$ exceeds a bound depending on $n$). It is known² that, if $f(n)$ is any additive function, the functions $f^{p}(n), f^{r}(n), \dots$ belonging to any finite set of distinct primes $p, r, \dots$ are statistically independent and that

(4)    $\int_{-\infty}^{\infty} \exp(i\alpha u)\, d\varphi^{p}(\alpha) = (1 - p^{-1}) \sum_{k=0}^{\infty} p^{-k} \exp(i f(p^{k}) u) \qquad (-\infty < u < \infty),$

where $\varphi^{p} = \varphi^{p}(\alpha)$ $(-\infty < \alpha < \infty)$ is the asymptotic distribution function of the additive function $f^{p}(n)$ of $n$.


3. Poisson's law. Since the terms of the series $f(n) = \sum f^{p}(n)$ are statistically independent, and since the appearance of primes for which $f^{p}(n)$ does not vanish is a rare event when $n \to \infty$, it is to be expected that, if the functions $f^{p}(n)$ belonging to the various primes $p$ are sufficiently homogeneous with respect to $p$ (in particular, if $f^{p}(p) = 1$ for every $p$), then the structure of the additive function $f(n)$ is subject to conditions very similar to those under which Poisson's law of distribution can rigorously be deduced.³


2 Cf. P. Erdés and A. Wintner, Additive arithmetical functions and statistical independence, 
American Journal of Mathematics, vol. 61(1939), pp.713-721, more particularly, pp. 718-719. 

3Cf., e.g., A. Khintchine, Asymplotische Gesetze der Wahrscheinlichkeitsrechnung, Er- 
gebnisse der Mathematik und ihrer Grenzgebiete, vol. 2(1933), no. 4, chap. 2, pp. 16-20. 
















A "random variable" is said to be distributed according to Poisson's law if the distinct "states" of which it is capable form an infinite sequence $S_1, S_2, \dots$ in such a way that the "probability", say $[S_m]$, of the "state" $S_m$ has the value $[S_m] = \lambda^{m-1} e^{-\lambda}/(m - 1)!$, where $\lambda$ is a positive number independent of $m$ (= 1, 2, ...). Clearly, the total probability, $\sum [S_m]$, is 1 for every $\lambda$. Since, as easily verified, $\sum m^{2}[S_m] - (\sum m[S_m])^{2} = \lambda$, the square root of $\lambda$ is precisely the standard deviation.
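Both statements (total probability 1 and variance $\lambda$) can be confirmed numerically. This is a small sketch under the assumption that truncating the series after a few hundred terms is harmless for moderate $\lambda$; the names `poisson_states` and `total_and_variance` are hypothetical.

```python
import math


def poisson_states(lam, terms=200):
    """Probabilities [S_m] = lam**(m-1) * exp(-lam) / (m-1)! for m = 1..terms."""
    probs = []
    p = math.exp(-lam)  # [S_1]
    for m in range(1, terms + 1):
        probs.append(p)
        p *= lam / m  # [S_{m+1}] = [S_m] * lam / m
    return probs


def total_and_variance(lam, terms=200):
    """Return (sum of [S_m], sum m^2 [S_m] - (sum m [S_m])^2)."""
    probs = poisson_states(lam, terms)
    total = sum(probs)
    mean = sum(m * p for m, p in enumerate(probs, start=1))
    second = sum(m * m * p for m, p in enumerate(probs, start=1))
    return total, second - mean**2


if __name__ == "__main__":
    print(total_and_variance(3.0))
```

The recurrence avoids forming large factorials directly; since the states are numbered from $m = 1$, the mean is $\lambda + 1$ while the variance remains $\lambda$.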

Now let the "random variable" be identified with an additive function $f(n)$ and let its $m$-th "state" $S_m$ take place for those values of $n$ for which $f(n)$ attains the value $m$. Thus $f(n)$ is one of the "states" $S_1, S_2, \dots$ for every $n > 1$ if and only if every $f(q^{k})$ is a positive integer. Actually, the values attained for prime powers $q^{k}$ in which $k > 1$ are relatively unimportant. In fact, all that is needed is that the "probability" that $f(n)$ be in none of the "states" $S_m$ be 0; in other words, that only $o(x)$ of the first $x$ positive integers $n$ should lead to values $f(n)$ distinct from every positive integer $m$. It is clear from the definition of $f_t(x)$ at the beginning of this paper that the ratio $f_m(x)/x$ is the relative frequency ("probability a posteriori") of the "state" $S_m$ when $n$ varies from $n = 1$ to $n = x$ ($n = 1$ and $n = x$ being excluded).


4. The problem. If $n$ could be restricted to the range $1 < n < x$, one might infer that $f_m(x)/x$ is identical with the value $[S_m]$ ("probability a posteriori"), supplied by the deduction of Poisson's law. But if $n$ is restricted to a finite range $1 < n < x$, then the terms of $f(n) = \sum f^{p}(n)$ are not statistically independent, since their statistical independence holds only in terms of the asymptotic distributions belonging to $x \to \infty$. Naturally, one can consider the infinite $n$-range, on which the functions $f^{p}(n)$ are statistically independent in this asymptotic sense ($x \to \infty$). But then Poisson's law is not available since its deduction assumes finite Lebesgue measures (which are additive) and not asymptotic distributions (defining relative measures which are not, in general, additive). Accordingly, it cannot be inferred that, if $m$ is fixed and $x \to \infty$, then the relative frequency of the "state" $S_m$ is asymptotically represented by the corresponding Poissonian probability, i.e., that

(5)    $\frac{f_m(x)}{x} \sim \frac{\lambda^{m-1} e^{-\lambda}}{(m - 1)!}$

as $x \to \infty$, where $\lambda = \lambda(x)$; it is understood that $\lambda = \lambda(x)$ denotes the square of the standard deviation belonging to the finite range preceding $x$. Nevertheless, the remarks made after (3) suggest that (5) is likely to be true for every fixed $m$ if the underlying additive function $f(n)$ satisfies certain Tauberian conditions; conditions which make legitimate the interchange of two limit processes (the latter are the step $x \to \infty$ and the process occurring in the standard deduction of Poisson's law). It is clear from the purely Tauberian character of the problem that no argument based on probability can prove the heuristic relation (5). Actually, it turns out that, even in case of the simplest functions $f(n)$, the validity of Poisson's law (5) is equivalent to the prime number theorem,







$\pi(x) \sim x/\log x$. The conditions for the additive function $f(n)$ which assure the
truth of (5) will be introduced successively. 


5. The finite stage. Suppose first that $\sum f(p^{k})^{2}/p^{k}$ converges for every fixed prime $p$, where $k$ (= 1, 2, ...) is the summation index. Then two differentiations of (4) at $u = 0$ show that, if $\mu_p$ and $M_p$ are abbreviations for $(1 - p^{-1}) \sum f(p^{k})/p^{k}$ and $(1 - p^{-1}) \sum f(p^{k})^{2}/p^{k}$, the asymptotic distribution function, $\varphi^{p} = \varphi^{p}(\alpha)$, of the additive function $f^{p}(n)$ has $\mu_p$ and $M_p$ as momenta of first and second order respectively. Hence, its standard deviation is $\lambda_p^{1/2}$, where $\lambda_p = M_p - \mu_p^{2}$. Thus, if $p, r, \dots$ is any finite set of primes, the standard deviation of the convolution $\varphi^{p} * \varphi^{r} * \cdots$ follows by substituting $\lambda_p = M_p - \mu_p^{2}$, $\lambda_r = M_r - \mu_r^{2}$, ... into $(\lambda_p + \lambda_r + \cdots)^{1/2}$. But, if $p, r, \dots$ are distinct, then, as mentioned before (4), the additive functions $f^{p}(n), f^{r}(n), \dots$ of $n$ are statistically independent (no matter how the additive function $f(n)$ be chosen). Consequently,⁴ the function $f^{p}(n) + f^{r}(n) + \cdots$ of $n$ has an asymptotic distribution function which is precisely the convolution $\varphi^{p} * \varphi^{r} * \cdots$ of the asymptotic distribution functions of the functions $f^{p}(n), f^{r}(n), \dots$ of $n$. Accordingly (see footnote 4), if $f^{p}(n) + f^{r}(n) + \cdots$ is chosen to be a partial sum of the series $f(n) = \sum f^{p}(n)$, say the partial sum

$\sum_{p<x} f^{p}(n),$

then

$\sum_{p<x} \left\{ (1 - p^{-1}) \sum_{k=1}^{\infty} \frac{f(p^{k})^{2}}{p^{k}} - \left( (1 - p^{-1}) \sum_{k=1}^{\infty} \frac{f(p^{k})}{p^{k}} \right)^{2} \right\}$

is the square of the standard deviation of the asymptotic distribution function of the additive function represented by the partial sum. Let the square of the latter standard deviation, which is a function of $x$, be denoted by $\lambda(x)$. Since $\lambda(x)$ is the multiple sum in the last formula line, it is clear that, if the (arbitrary) values $f(p^{k})$ defining the additive function $f(n)$ do not increase too rapidly as $p \to \infty$, $k \to \infty$ (for instance, if

(6)    $|f(p^{k})| < C k^{c}$

holds for two sufficiently large constants $C$, $c$), then

$\lambda(x) - \sum_{p<x} \left[ (1 - p^{-1}) \frac{f(p)^{2}}{p} - \left( (1 - p^{-1}) \frac{f(p)}{p} \right)^{2} \right]$

tends to a limit as $x \to \infty$ (the last sum being the contribution of $k = 1$ alone). Finally, if the values $f(p^{k})$ belonging to $k = 1$ are so chosen that $f(p) = 1$ for

4Cf. B. Jessen and A. Wintner, Distribution functions and the Riemann zeta function, 
Transactions of the American Mathematical Society, vol. 38(1935), pp. 48-88, more par- 
ticularly, pp. 84-85 and 56-57. 
















every $p$, then the last sum, $\sum [\ ]$, reduces to $\log\log x + \text{const.} + o(1)$ as $x \to \infty$ (Mertens).⁵

Thus, if (6) is satisfied and $f(p) = 1$, then $\lambda(x) - \log\log x$ tends to a finite limit as $x \to \infty$. In particular, $\lambda(x) \sim \log\log x$. Consequently, the Poissonian formula (5) can now be written in the form

(7)    $f_m(x) \sim L_m(x)$

as $x \to \infty$, where

$L_m(x) = \frac{x(\log x)^{-1}(\log\log x)^{m-1}}{(m - 1)!}.$

Accordingly, what remains to be established is a set of conditions which, in conjunction with (6) and $f(p) = 1$, assure the truth of (7) for the additive function $f(n)$ assigned by the values $f(p^{k})$; it is understood that $m$ is arbitrarily fixed.


6. The Tauberian condition. It will now be proved that such a set of conditions is represented by the positivity of all the values $f(p^{k})$. What will actually be shown is that (7) holds for every $m$ whenever the conditions

(8a)    $f(p) = 1,$        (8b)    $f(p^{k}) > 0 \qquad (k = 2, 3, 4, \dots)$

are satisfied for every prime $p$ so that no limitation of the type (6) will now be needed. It is clear that, while (8a) is the homogeneity condition mentioned after (4), condition (8b) represents the Tauberian restriction referred to after (5).

It is instructive that this form of a Tauberian restriction is so essential as to become inadequate even in the limiting case, where the inequalities (8b) become equalities. In fact, let $f(n)$ be that additive function for which $f(p^{k})$ is 1 or 0 according as $k = 1$ or $k > 1$ (cf. (8a) and (8b), where $p$ is arbitrary). Clearly, $f(n)$ then is the number of the simple prime factors of $n$ so that $f(n)$ is the function $\theta(n)$ defined before (2). Hence, (2) shows that the Poissonian law (7) is false in this case (although even (6) is satisfied). The simplest instances of additive functions $f(n)$ satisfying all the conditions (8a), (8b), (6) are two classical functions,⁶ the first of which represents the number of the distinct prime divisors of $n$ and the second the number of all prime divisors of $n$. In fact, it is clear that these two additive functions $f(n)$ are respectively defined by $f(p^{k}) = 1$ and $f(p^{k}) = k$, where $p$ and $k$ are arbitrary. In view of these examples, it is worth emphasizing that the values $f(p^{k})$ will not be restricted to integers.

7. A lemma. The proof of the fact that (8a) and (8b) imply (7) can be based on a remark which has nothing to do with additive functions, since it can be formulated as follows.

Let $\theta(n)$ denote the function defined in (2), and let $S$ be any set of positive integers which has the property that there exists a fixed positive integer $m = m(S)$ satisfying

5 Cf. C. F. Gauss, loc. cit. (see footnote 1), p. 12 and p. 17. 
6 Cf. E. Landau, loc. cit. (see footnote 1). 







the following conditions: No positive integer $n$ satisfying $\theta(n) > m$ is in $S$ and a positive integer $n$ satisfying $\theta(n) = m$ is or is not in $S$ according as $n$ is or is not square-free. Then

(9)    $S(x) \sim L_m(x)$

as $x \to \infty$, where $L_m(x)$ is the same function as in (1) and $S(x)$ denotes the number of those elements of $S$ which are less than $x$.

The integers $n$ about which it is not assumed whether or not they are in $S$ are those integers $n$ for which neither $\theta(n) > m$ nor $\theta(n) = m$; hence $\theta(n) < m$, and so these integers $n$ are in one of the $m - 1$ classes defined by $\theta(n) = 1$, $\theta(n) = 2$, ..., $\theta(n) = m - 1$. But $\theta_1(x), \theta_2(x), \dots, \theta_{m-1}(x)$ denote the number of all integers $n$ less than $x$ satisfying $\theta(n) = 1$, $\theta(n) = 2$, ..., $\theta(n) = m - 1$ respectively. Since (2) and (1) imply that $\theta_1(x) + \cdots + \theta_{m-1}(x) = o(L_m(x))$, it follows that the number of those elements of $S$ which are less than $x$ and satisfy neither $\theta(n) > m$ nor $\theta(n) = m$ is $o(L_m(x))$. On the other hand, an integer $n$ satisfying either $\theta(n) > m$ or $\theta(n) = m$ is supposed to be in $S$ if and only if $n$ satisfies $\theta(n) = m$ and is square-free. Since $\theta(n)$ denotes the number of simple prime factors of $n$, it follows that the number of those elements of $S$ less than $x$ which have not been enumerated by the preceding $o(L_m(x))$ is identical with the number of those positive integers less than $x$ which are composed of $m$ simple prime factors. But the latter number is $\pi_m(x)$, by the definition of $\pi_m(x)$ in (1). Accordingly, $S(x)$ is the sum of $o(L_m(x))$ and $\pi_m(x)$. Hence, (9) follows from (1).


8. The proof. In order to deduce (7) from (9), let $S$ be the set of those positive integers for which

(10)    $S$: $f(n) = m$,

where $f(n)$ is a fixed additive function and the integer $m$ has a given value. Then the definition of $f_m(x)$ at the beginning of this paper shows that (7) is identical with (9). Hence, all that remains to be shown is that, if (8a) and (8b) are satisfied, the $n$-set $S$ defined by (10) fulfills the conditions under which (9) has been established.

To this end, let $p_1 = p_1(n)$, $p_2 = p_2(n)$, ... denote the simple prime factors of an arbitrary positive integer $n$ (> 1). Since $f(n)$ is additive, it is clear from (8b) that $f(n) > f(p_1) + f(p_2) + \cdots$ or $f(n) = f(p_1) + f(p_2) + \cdots$ according as not all or all prime factors of $n$ are simple. Since (8a) and the definition of $\theta(n)$ imply that $f(p_1) + f(p_2) + \cdots = \theta(n)$, it follows that $f(n) > \theta(n)$ or $f(n) = \theta(n)$ according as $n$ is not or is square-free. It follows that no $n$ satisfying $\theta(n) > m$ is in the set (10) and that an $n$ satisfying $\theta(n) = m$ is in the set (10) if and only if $n$ is square-free. Since these properties are exactly the properties required of $S$ in (9), the proof is complete.


THE JOHNS HOPKINS UNIVERSITY.
















PARAMETRIC SOLUTIONS OF CERTAIN DIOPHANTINE EQUATIONS 
By E. T. Bell
1. Introduction. The complete integer solution of
(1.1) $x_1 y_1 + \cdots + x_n y_n = 0$
is given by the formulas¹
(1.2) $x_i = \alpha\alpha_i, \qquad y_i = -\sum_{j=1}^{i-1} \alpha_j \beta_{j,i} + \sum_{j=1}^{n-i} \alpha_{i+j}\beta_{i,i+j}$,
where the Greek letters denote integer parameters, with the convention (as
always) that a summation (or range of values) in which the lower limit exceeds
the upper is vacuous. Let $f_i(x_1, \dots, x_n)$ $(i = 1, \dots, n)$ be any functions which
for integer values of $x_1, \dots, x_n$ take integer values. Then the transformation
(1.3) $y_i \to y_i - f_i(x_1, \dots, x_n)$ $(i = 1, \dots, n)$
takes (1.1) into
(1.4) $\sum_{i=1}^{n} x_i f_i(x_1, \dots, x_n) = \sum_{i=1}^{n} x_i y_i$,
and the complete integer solution of (1.4) is
(1.5) $x_i = \alpha\alpha_i, \qquad y_i = -\sum_{j=1}^{i-1} \alpha_j \beta_{j,i} + \sum_{j=1}^{n-i} \alpha_{i+j}\beta_{i,i+j} + f_i(\alpha\alpha_1, \dots, \alpha\alpha_n)$.


These solutions are valid in any Euclidean ring, as may be seen from the proof 
(see footnote 1) of (1.2). The like, therefore, holds for equations and their 
solutions obtained from (1.1), (1.4) by operating within any given Euclidean 
ring. 
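
As a numerical illustration of (1.2) and (1.5), the following sketch draws random integer parameters and evaluates both sides. The helper names and the sample function `f` are assumptions made only for this example; they are not part of the paper. The sign convention is the one forced by (1.1): the sum over $j < i$ enters negatively, the sum over $i + j$ positively.

```python
import random

def skolem_solution(n, alpha, alphas, beta):
    """Evaluate (1.2). alphas[i-1] is alpha_i; beta[(j, i)] is beta_{j,i} for j < i."""
    x = [alpha * alphas[i - 1] for i in range(1, n + 1)]
    y = []
    for i in range(1, n + 1):
        lower = sum(alphas[j - 1] * beta[(j, i)] for j in range(1, i))
        upper = sum(alphas[i + j - 1] * beta[(i, i + j)] for j in range(1, n - i + 1))
        y.append(upper - lower)
    return x, y

def random_parameters(n, seed):
    """Random integer values for alpha, alpha_1..alpha_n and the beta_{j,i}."""
    rng = random.Random(seed)
    alpha = rng.randint(-9, 9)
    alphas = [rng.randint(-9, 9) for _ in range(n)]
    beta = {(j, i): rng.randint(-9, 9) for i in range(1, n + 1) for j in range(1, i)}
    return alpha, alphas, beta

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def f(i, x):
    # Hypothetical integer-valued f_i, chosen only for this illustration.
    return x[i] ** 2 + i + 1

n = 4
alpha, alphas, beta = random_parameters(n, seed=1)
x, y = skolem_solution(n, alpha, alphas, beta)
residual_11 = dot(x, y)                               # left side of (1.1)
y_shifted = [y[i] + f(i, x) for i in range(n)]        # the y_i of (1.5)
residual_14 = dot(x, [f(i, x) for i in range(n)]) - dot(x, y_shifted)  # (1.4)
```

Both residuals vanish identically in the parameters, which is the content of (1.2) and (1.5); the sketch says nothing about completeness of the solution.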

Equations of the types (1.6)—(1.11) are to be considered. 


(1.6) $a_1 x_1 \cdots x_{i_1} + a_2 y_1 \cdots y_{i_2} + \cdots + a_n z_1 \cdots z_{i_n} = 0$,


in which $a_1, \dots, a_n$ are constant integers $\neq 0$ and $i_1 > 1$, $i_2 > 1$, $\dots$, $i_n > 1$.
This is one possible generalization of (1.1); its complete integer solution proceeds 
from (1.2). 


(1.7) $\sum_{j=1}^{n} a_j x_{j1} \cdots x_{j i_j}\, f_j(x_{11}, \dots, x_{n i_n}) = \sum_{j=1}^{n} a_j x_{j1} \cdots x_{j i_j}\, y_j$,


Received March 5, 1942. 

1 Th. Skolem, Diophantische Gleichungen, 1938, p. 20. The form of this solution is
considerably simpler than that given by the method of L. Aubry, Réponse à la solution générale
par identités de l'équation par V. G. Tariste, L'Intermédiaire des Mathématiciens, vol. 23
(1916), pp. 133-134, reproduced in Dickson (see footnote 2), p. 194.




in which the $a_j$ are constant integers $\neq 0$ and the $f_j$ are as in (1.4). The complete
integer solution of (1.7) follows from that of (1.6) in the same way that
(1.5) follows from (1.2), so that (1.7) need not be further discussed.


(1.8) $Q(x_1, \dots, x_n) = x_1 y_1 + \cdots + x_n y_n$,


where $Q$ is the general homogeneous quadratic form in $x_1, \dots, x_n$ with integer
coefficients. The complete integer solution of (1.8) is obtained.


(1.9) $Q(x_1, \dots, x_{n-1}) = uv$,


where $Q$ is the general homogeneous quadratic form in $x_1, \dots, x_{n-1}$ with integer
coefficients.

The complete integer solution of (1.9) with $n - 1 = 2$ was found by Dickson,²
using the classical theory of binary quadratic forms. Using his generalized
quaternions, Dickson (see footnote 2, p. 193) found the complete integer solution
of


$x_1^2 - a x_2^2 - b x_3^2 + ab x_4^2 = uv$

for certain special values of $a$, $b$, and a parametric solution for arbitrary integers
$a$, $b$. Latimer³ obtained more general results for the same equation. In the
lack of complete solutions for (1.9) with $n > 3$, it may be of interest to record
a parametric solution, although there is no reason to suppose that this is the
complete integer solution for any particular value of $n$, even if for some $n$ the
parametric solution of special cases of (1.9) can be identified with complete
solutions found otherwise.


(1.10) $Q(x_1, \dots, x_n) = a_1 x_1 y_1 + \cdots + a_n x_n y_n$,


where $Q$ is as in (1.8) and $a_1, \dots, a_n$ are constant integers different from zero.


(1.11) $Q(x_1, \dots, x_{n-1}) = auv$,


which differs from (1.9) only by the constant coefficient $a \neq 0$.


2. Equation (1.6). The solution of an equation of this type is reduced to that 
of an equation of type (1.1) and an associated multiplicative system. The first 


2 L. E. Dickson, Modern Elementary Theory of Numbers, University of Chicago, 1939,
p. 190.

3 C. G. Latimer's extensions are summarized in Dickson (see footnote 2, p. 193).

4 The parametric solution of (1.9) is obtained by determining certain of the parameters
in the complete integer solution of another equation, say E, so that some of the
indeterminates vanish. By suitable choice of the coefficients, E then degenerates to (1.9).
Simultaneously the complete integer solution of E degenerates to a parametric solution of
(1.9). There are an infinity of equations which degenerate in this way to (1.9), and each
furnishes a parametric solution of (1.9). If the entire class of equations which degenerate
to a given equation could be defined, something might be inferred about the completeness
of the degenerated solution. But the class appears to be undefinable constructively.










term in (1.6) is written $a_1 x_1 \cdots x_{i_1 - 1} \times x_{i_1}$, and similarly for all. Then, by
(1.1), (1.2),

$a_1 x_1 \cdots x_{i_1 - 1} = \alpha\alpha_1, \quad a_2 y_1 \cdots y_{i_2 - 1} = \alpha\alpha_2, \quad \dots, \quad a_n z_1 \cdots z_{i_n - 1} = \alpha\alpha_n$,

the associated multiplicative system, to be solved for the $x, y, \dots, z$ and the $\alpha$.
The solution is given non-tentatively by either of two methods (see footnote 1,
Skolem, Kap. 4) and the values thus found for $\alpha_1, \dots, \alpha_n$ are then substituted
in the expressions for $x_{i_1}, y_{i_2}, \dots, z_{i_n}$ obtained from the second set of equations
in (1.2) by writing $x_{i_1}, y_{i_2}, \dots, z_{i_n}$ for $y_1, y_2, \dots, y_n$ respectively.


The complete solution falls into classes of solutions, each class containing a
single representative, according to the divisors of the constant coefficients
$a_1, \dots, a_n$ in (1.6). We discuss only the case in which the Euclidean ring
concerned is that of the rational integers.⁵
Let $p_1, p_2, p_3, \dots$ be the primes 2, 3, 5, $\dots$ in ascending order. The positive
integer $m$ may be written $m = p_\sigma^{m_\sigma}$, where the repeated Greek suffix indicates a
product convention (analogous to the sum convention in tensors) over
$\sigma = 1, 2, 3, \dots$; thus $p_\sigma^{m_\sigma} = \prod_{s=1}^{\infty} p_s^{m_s}$. If $p_t$ is the greatest prime dividing $m$,
$m_s = 0$ for $s > t$.
In solving (1.6), the constants $a_i$ are temporarily replaced by distinct
indeterminates, $a_i \to u_i$, and the resulting equation, with all coefficients 1, is solved.
Let $\theta_1, \dots, \theta_r$ be all those parameters in the solution that appear in the
parametric expressions of the $u_i$, so that $u_i = \theta_1^{c_{i1}} \cdots \theta_r^{c_{ir}}$, where $c_{i1}, \dots, c_{ir}$
are integers $\geq 0$. Hence

(2.1) $a_i = \theta_1^{c_{i1}} \cdots \theta_r^{c_{ir}}$ $(i = 1, \dots, n)$,
and at least one of $\theta_1, \dots, \theta_r$ is a divisor of some $a_i$. All sets $(\theta_1, \dots, \theta_r)$
satisfying (2.1) may be found (when $a_1, \dots, a_n$ are any given integers) by solving
a set (2.2) of linear Diophantine equations in non-negative integers. From
(2.1),

$p_\sigma^{a_{i\sigma}} = (p_\sigma^{\theta_{1\sigma}})^{c_{i1}} \cdots (p_\sigma^{\theta_{r\sigma}})^{c_{ir}}$ $(i = 1, \dots, n)$,
and, therefore,
(2.2) $c_{i1}\theta_{1j} + \cdots + c_{ir}\theta_{rj} = a_{ij}$ $(i = 1, \dots, n;\ j = 1, \dots, t)$,
in which $a_{ij}$ and the $c_{ij} \geq 0$ are given constant integers and $p_t$ is the greatest
prime dividing $a_1 \cdots a_n$. A solution $(\theta_1, \dots, \theta_r) = (p_\sigma^{\theta_{1\sigma}}, \dots, p_\sigma^{\theta_{r\sigma}})$ of (2.2)
determines a set of the constants appearing as coefficients in the complete
solution of (1.6), which is separated into classes according to the solutions of (2.2).
The determination of the number of classes is a solvable problem in compound
partitions.
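
The system (2.2) can be enumerated by brute force for small data. The sketch below is only an illustration under stated assumptions: the exponent matrix `c` and the coefficients `a` are hypothetical (they correspond to $u_1 = \theta_1\theta_2$, $u_2 = \theta_1\theta_3$ with $a_1 = 12$, $a_2 = 18$), all $a_i$ are taken positive, and every $\theta_k$ is assumed to occur in some $u_i$ so that the search is finite.

```python
from itertools import product

def prime_exponents(m):
    """Prime factorization of |m| as a dict {prime: exponent}."""
    m = abs(m)
    out = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            out[p] = out.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        out[m] = out.get(m, 0) + 1
    return out

def nonneg_solutions(c, rhs):
    """All non-negative integer t with sum_k c[i][k] * t[k] == rhs[i] for every i."""
    rows, r = len(c), len(c[0])
    bounds = []
    for k in range(r):
        limits = [rhs[i] // c[i][k] for i in range(rows) if c[i][k] > 0]
        if not limits:
            raise ValueError("every parameter must occur in some u_i")
        bounds.append(min(limits))
    return [t for t in product(*(range(b + 1) for b in bounds))
            if all(sum(c[i][k] * t[k] for k in range(r)) == rhs[i] for i in range(rows))]

def theta_sets(c, a):
    """All (theta_1, ..., theta_r) of positive integers satisfying (2.1)."""
    primes = sorted(set().union(*(prime_exponents(ai) for ai in a)))
    per_prime = []
    for p in primes:
        rhs = [prime_exponents(ai).get(p, 0) for ai in a]   # the a_{ij} of (2.2)
        per_prime.append([(p, t) for t in nonneg_solutions(c, rhs)])
    result = []
    for choice in product(*per_prime):
        theta = [1] * len(c[0])
        for p, t in choice:
            for k, e in enumerate(t):
                theta[k] *= p ** e
        result.append(tuple(theta))
    return result

c = [[1, 1, 0], [1, 0, 1]]      # hypothetical exponents c_{ik}
a = [12, 18]                    # hypothetical coefficients a_i
classes = theta_sets(c, a)
```

In this example the classes are indexed by the divisors of $\gcd(12, 18) = 6$, one for each admissible value of $\theta_1$.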


5 A slight modification of the device employed for this case takes care of the general 


Euclidean ring. 







3. Equation (1.8). The solution of this equation is a preliminary to that of
(1.9). The general $n$-ary homogeneous quadratic form with integer coefficients is

$Q(x_1, \dots, x_n) = \sum_{i=1}^{n} c_{i,i} x_i^2 + \sum_{i=1}^{n-1} \sum_{j=1}^{n-i} c_{i,i+j} x_i x_{i+j}$,

and the general linear transformation with integer coefficients which takes
$x_1 y_1 + \cdots + x_n y_n$ into this is

$y_i \to \sum_{j=1}^{i-1} (c_{j,i} - \gamma_{j,i}) x_j + c_{i,i} x_i + \sum_{j=1}^{n-i} \gamma_{i,i+j} x_{i+j}$,

where the $\gamma$'s are $\tfrac{1}{2}n(n - 1)$ independent integer parameters. From (1.3) and
(1.4), it follows that the complete integer solution of (1.8) is

$y_i = \sum_{j=1}^{i-1} \alpha_j[\alpha(c_{j,i} - \gamma_{j,i}) - \beta_{j,i}] + c_{i,i}\alpha\alpha_i + \sum_{j=1}^{n-i} \alpha_{i+j}(\alpha\gamma_{i,i+j} + \beta_{i,i+j})$,
$x_i = \alpha\alpha_i$ $(i = 1, \dots, n)$.

There are $\tfrac{1}{2}n(n - 1)$ parameters $\beta_{i,j}$ and $n + 1$ parameters $\alpha$, $\alpha_i$. Hence,
counting also the $\tfrac{1}{2}n(n - 1)$ parameters $\gamma_{i,j}$,
the total number of parameters in the solution is $n^2 + 1$.
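
The identity behind this solution is easy to test numerically. The following sketch, with hypothetical helper names and random integer data, substitutes the formulas for $x_i$, $y_i$ into (1.8) and also recomputes the parameter count; it checks only that the formulas solve the equation, not that the solution is complete.

```python
import random

def solve_18(n, c, alpha, alphas, beta, gamma):
    """x_i, y_i of Section 3; c[(i, k)] for i <= k, beta/gamma[(i, k)] for i < k."""
    a = lambda i: alphas[i - 1]
    x = [alpha * a(i) for i in range(1, n + 1)]
    y = []
    for i in range(1, n + 1):
        lower = sum(a(j) * (alpha * (c[(j, i)] - gamma[(j, i)]) - beta[(j, i)])
                    for j in range(1, i))
        upper = sum(a(k) * (alpha * gamma[(i, k)] + beta[(i, k)])
                    for k in range(i + 1, n + 1))
        y.append(lower + c[(i, i)] * alpha * a(i) + upper)
    return x, y

def Q(n, c, x):
    """The quadratic form with coefficients c[(i, k)], i <= k."""
    return sum(c[(i, k)] * x[i - 1] * x[k - 1]
               for i in range(1, n + 1) for k in range(i, n + 1))

def random_case(n, seed):
    rng = random.Random(seed)
    r = lambda: rng.randint(-6, 6)
    pairs = [(i, k) for i in range(1, n + 1) for k in range(i + 1, n + 1)]
    c = {(i, k): r() for i in range(1, n + 1) for k in range(i, n + 1)}
    beta = {p: r() for p in pairs}
    gamma = {p: r() for p in pairs}
    return c, r(), [r() for _ in range(n)], beta, gamma

def parameter_count(n):
    """beta's + gamma's + alpha + the alpha_i."""
    return 2 * (n * (n - 1) // 2) + 1 + n

n = 4
c, alpha, alphas, beta, gamma = random_case(n, seed=3)
x, y = solve_18(n, c, alpha, alphas, beta, gamma)
difference = Q(n, c, x) - sum(xi * yi for xi, yi in zip(x, y))
```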


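4. Equation (1.9). With the notation as in §3, and $i = 1, \dots, n$, write

$\delta_{i,j} = \alpha(c_{j,i} - \gamma_{j,i}) - \beta_{j,i}$ $(j = 1, \dots, i - 1)$,
$\delta_{i,i} = \alpha c_{i,i}$,
$\delta_{i,i+j} = \alpha\gamma_{i,i+j} + \beta_{i,i+j}$ $(j = 1, \dots, n - i)$.

Then
$y_i = \sum_{j=1}^{n} \delta_{i,j}\alpha_j$,
and $y_s = 0$ $(s = 1, \dots, n - 1)$ is the system
(4.1) $\sum_{j=1}^{n} \delta_{s,j}\alpha_j = 0$ $(s = 1, \dots, n - 1)$
of $n - 1$ homogeneous linear equations in $\alpha_1, \dots, \alpha_n$ with the $(n - 1) \times n$
matrix $(\delta_{s,j})$. Denote by $(\delta_{s,j})_i$ the determinant of order $n - 1$ obtained from
$(\delta_{s,j})$ by deleting the $i$-th column of $(\delta_{s,j})$, and let $\delta$ be the G.C.D. of
$(\delta_{s,j})_1, \dots, (\delta_{s,j})_n$. Let $\lambda$ denote an integer parameter, and write $\mu = \lambda/\delta$.
Then the complete integer solution of (4.1) is
(4.2) $\alpha_i = (-1)^{i-1}\mu(\delta_{s,j})_i$ $(i = 1, \dots, n)$.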
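
A short sketch of (4.2) for a small matrix: the maximal minors are computed by cofactor expansion (adequate only for small $n$), divided by their G.C.D., and given alternating signs. The function names and the sample matrix are assumptions for illustration; the matrix is taken to have rank $n - 1$.

```python
from functools import reduce
from math import gcd

def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k] * det([row[:k] + row[k + 1:] for row in m[1:]])
               for k in range(len(m)))

def minor_solution(delta, lam=1):
    """alpha_i of (4.2) for an (n-1) x n integer matrix delta of rank n-1."""
    n = len(delta[0])
    minors = [det([row[:i] + row[i + 1:] for row in delta]) for i in range(n)]
    g = abs(reduce(gcd, minors))
    if g == 0:
        raise ValueError("all minors vanish: rank is below n - 1")
    # Index i is 0-based here, so (-1) ** i is the sign (-1)^(i-1) of (4.2).
    return [(-1) ** i * lam * M // g for i, M in enumerate(minors)]

delta = [[1, 2, 3],
         [4, 5, 6]]            # hypothetical coefficients delta_{s,j}
alphas = minor_solution(delta, lam=1)
residuals = [sum(d * a for d, a in zip(row, alphas)) for row in delta]
```

For this sample matrix the minors are $-3, -6, -3$ with G.C.D. 3, so the primitive solution is $(-1, 2, -1)$ and every integer solution is a multiple $\lambda$ of it.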


The values (4.2) of $\alpha_i$ substituted into the solution in §3, with $Q$ as there, give
a parametric solution of

$Q(x_1, \dots, x_n) = x_n y_n$

as polynomials with integer coefficients in $n^2 - n + 2$ integer parameters. If
in this equation and its solution all $c_{r,s}$ in which at least one of $r$, $s$ is $n$ are set
equal to zero, and the notation $x_n$, $y_n$ is changed to $u$, $v$, we get a parametric
solution of (1.9). If $D_n$ is the determinant of $y_1, \dots, y_n$ considered as linear
forms in $\alpha_1, \dots, \alpha_n$, and $D_n^*$ is the determinant obtained from $D_n$ by setting
$c_{i,n} = c_{n,i} = 0$ $(i = 1, \dots, n)$, the value of $v$ is $(-1)^{n-1}\mu D_n^*$.


5. Equations (1.10), (1.11). The form $Q$ is as in §3; the first equation to be
considered is
(1.10) $Q(x_1, \dots, x_n) = a_1 x_1 y_1 + \cdots + a_n x_n y_n$

with $a_1 \cdots a_n \neq 0$. The method for (1.10) is an immediate extension of that
for (1.9) in §3.
Necessary and sufficient conditions that $y_i \to y_i'$, where

(5.1) $y_i' = \sum_{j=1}^{n} \gamma_{i,j} x_j$ $(i = 1, \dots, n)$

with integer coefficients $\gamma_{i,j}$, take $\sum_{i=1}^{n} a_i x_i y_i$ into $Q(x_1, \dots, x_n)$ are

(5.2) $a_i\gamma_{i,i} = c_{i,i}, \qquad a_i\gamma_{i,j} + a_j\gamma_{j,i} = c_{i,j}$ $(i < j)$.

It will be assumed that these conditions are satisfied. Hence, if $a_{i,j}$ is the G.C.D.
of $a_i$, $a_j$, it is necessary that $a_i \mid c_{i,i}$ and $a_{i,j} \mid c_{i,j}$, so that

$c_{i,i} = a_i c_i', \quad a_i = a_{i,j} a_i', \quad a_j = a_{i,j} a_j', \quad c_{i,j} = a_{i,j} c_{i,j}'$,

where all the letters denote integers and $a_i'$, $a_j'$, $c_{i,j}'$ are coprime in pairs. Hence
$\gamma_{i,i} = c_i'$. The general solution of the second equation (5.2) is

$\gamma_{i,j} = c_{i,j}'\gamma_{i,j}' + a_j'\lambda_{i,j}, \qquad \gamma_{j,i} = c_{i,j}'\gamma_{j,i}' - a_i'\lambda_{i,j}$ $(i < j)$,

where $\gamma_{i,j}'$, $\gamma_{j,i}'$ is any solution of

$a_i'\gamma_{i,j}' + a_j'\gamma_{j,i}' = 1$,

and $\lambda_{i,j}$ is an integer parameter. From (5.1) we have

$y_i' = \sum_{j=1}^{i-1} (c_{j,i}'\gamma_{i,j}' - a_j'\lambda_{j,i}) x_j + c_i' x_i + \sum_{j=1}^{n-i} (c_{i,i+j}'\gamma_{i,i+j}' + a_{i+j}'\lambda_{i,i+j}) x_{i+j}$.

Hence (1.10) may be written
(5.3) $\sum_{i=1}^{n} a_i x_i y_i'' = 0, \qquad y_i'' = y_i - y_i'$,
which is of type (1.6), the associated multiplicative system being
(5.4) $a_i x_i = \alpha\alpha_i$ $(i = 1, \dots, n)$.

From the solution $x_i$, $y_i''$ of (5.3), the solution of (1.10) is written down in an
obvious way. From this a parametric solution of (1.11) is obtained as in §4
for (1.9) from (1.8).
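
The general solution of the second equation (5.2) can be produced with the extended Euclidean algorithm. The sketch below is an illustration only; the function names and the sample values are assumptions, and the coefficients $a_i$, $a_j$ are taken non-zero with $\gcd(a_i, a_j)$ dividing $c_{i,j}$.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with a*s + b*t == g and g == gcd(a, b) >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def gamma_pair(a_i, a_j, c_ij, lam):
    """(gamma_{i,j}, gamma_{j,i}) solving a_i*gamma_ij + a_j*gamma_ji == c_ij."""
    g, s, t = ext_gcd(a_i, a_j)            # g plays the role of a_{i,j}
    if c_ij % g != 0:
        raise ValueError("a_{i,j} must divide c_{i,j}")
    ai_p, aj_p, c_p = a_i // g, a_j // g, c_ij // g   # the primed quantities
    # s, t is one particular solution of ai_p*s + aj_p*t == 1.
    return c_p * s + aj_p * lam, c_p * t - ai_p * lam

g_ij, g_ji = gamma_pair(6, 10, 8, lam=3)   # hypothetical sample values
check = 6 * g_ij + 10 * g_ji               # should reproduce c_ij = 8
```

Varying `lam` over the integers runs through all solutions, since two solutions differ by an integer multiple of $(a_j', -a_i')$.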


CALIFORNIA INSTITUTE OF TECHNOLOGY. 





THE DOUBLE-$N_n$ CONFIGURATION


By Arthur B. Coble


1. Introduction. We are concerned initially only with the double-$N_n$
configuration defined by a White [7] surface, a configuration in the linear space $[n + 1]$
which consists of $N_n$ lines, $l_i$, and of $N_n$ spaces $[n - 1]$, $\lambda_j$, such that $l_i$ and $\lambda_j$
are incident if and only if $i \neq j$ $(i, j = 1, \dots, N_n)$, where $N_n$ is the binomial
coefficient $\binom{n+2}{2}$. In conclusion, however, we raise the question as to
whether there may not well be configurations of this sort more general than
those which define, and are defined by, a White surface.

The White surface, $W_n$, in $[n + 1]$ is the map of the plane by the linear system
of curves of order $n + 1$ on a generic set, $P^2_{N_n}$, of points $p_1, \dots, p_{N_n}$ of the plane.
It has the order $(n + 1)^2 - N_n = N_{n-1}$. The directions about a point $p$,
say $p_i$, map into points on a line $l_i$ of the configuration $CW_n$ under discussion.
Let $C_j$ be the curve of order $n$ on all of the points of $P^2_{N_n}$ except $p_j$. Since
$P^2_{N_n}$ is generic, the curves $C_j$ are all distinct and generic. A particular curve
$C_j$ maps into a curve $k_j$ in an $[n - 1]$ of order $N_{n-2}$ which crosses each of the
lines $l_i$ $(i \neq j)$, since $C_j$ goes through $p_i$ with some definite direction. Let
$\lambda_j$ be the $[n - 1]$ in which $k_j$ lies. Then $\lambda_j$ also cuts $l_i$ if $i \neq j$. We thus obtain
from $P^2_{N_n}$ the double-$N_n$ configuration of the type we will call $CW_n$. The
configuration itself, apart from the $W_n$ which defines it, is formally self-dual.
The lines and $[n - 1]$'s are dual in $[n + 1]$. The non-incidence of $l_i$ and $\lambda_i$
implies that they have neither a $[0]$ nor a $[n]$ in common. The incidence of $l_i$
and $\lambda_j$ $(j \neq i)$ implies that they have a point $m_{ij}$ in common and a prime $\mu_{ij}$
in common.
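
The numerical characters quoted in this paragraph follow from elementary counts, which the sketch below recomputes. It assumes, as the text does, that the $N_n$ points are generic and so impose independent conditions; the function names are chosen only for this illustration.

```python
from math import comb

def N(n):
    """N_n = binomial(n + 2, 2); also the number of coefficients of a ternary n-ic."""
    return comb(n + 2, 2)

def white_surface_numbers(n):
    points = N(n)
    return {
        # curves of order n+1 have N(n+1) coefficients; each point imposes one condition
        "ambient_dimension": N(n + 1) - points - 1,
        # two curves of order n+1 meet in (n+1)^2 points, N_n of them at the base points
        "order_of_W": (n + 1) ** 2 - points,
        # C_j (order n) meets a curve of order n+1 in n(n+1) points, N_n - 1 at base points
        "order_of_k": n * (n + 1) - (points - 1),
        # the line (p_i p_j) meets a curve of order n+1 in n+1 points, 2 at base points
        "order_of_N_ij": (n + 1) - 2,
    }

cubic_case = white_surface_numbers(2)
```

For $n = 2$ this gives the cubic surface in $[3]$, with the curves $k_j$ of order $N_0 = 1$, i.e. the second half of the double-six.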

The first instance for $n = 1$, $N_1 = 3$ is the figure of three lines $l_1$, $l_2$, $l_3$ in
the plane $W_1$ and three points $\lambda_1$, $\lambda_2$, $\lambda_3$, each point on two of the lines, i.e., a
plane triangle. The mapping mentioned above by conics on $P^2_3$ is a quadratic
transformation from the plane of $P^2_3$ to $W_1$. In this case the $CW_1$ has no
geometric interest.

The second instance for $n = 2$, $N_2 = 6$ is the figure of six skew lines $l_1, \dots, l_6$
on a cubic surface $W_2$ in $[3]$ and the six lines $\lambda_1, \dots, \lambda_6$, which with $l_1, \dots, l_6$
form a "double-six" on the surface. This figure has two significant properties
for which we use the terms descriptively self-dual and intrinsically self-dual
with the following meanings. A formally self-dual configuration is
descriptively self-dual if for every figure constructed from its parts there exists a dual
figure dually constructed from its dual parts. A descriptively self-dual
configuration is intrinsically self-dual if there exists a correlation which transforms
each part into its dual part. Naturally the formal, descriptive, and intrinsic


Received March 24, 1942. This and the following article present material which was 
reported in a retiring address delivered at the Dallas meeting of the American Association 
for the Advancement of Science. 




self-duality are successive stages of restriction each implying the existence of
the preceding stages. The $CW_2$ is an example of intrinsic self-duality. An
instance of its descriptive self-duality is the following dual theorem: The points
[primes] of the lines $l_1, \dots, l_6$ [$\lambda_1, \dots, \lambda_6$] are on a unique cubic surface
$W_2$ [envelope $\overline{W}_2$] which also contains the points [primes] of the lines
$\lambda_1, \dots, \lambda_6$ [$l_1, \dots, l_6$]. Furthermore the existence of the Schur [6; 5] quadric
attached to the double-six, in which the lines $l_i$, $\lambda_i$ are pole and polar, shows that
$CW_2$ is also intrinsically self-dual. It is clear of course that the correlation which
exists in the case of an intrinsically self-dual configuration $C$ is unique and
involutorial unless $C$ itself is so special as to admit collineations.

It is the purpose of this article to show that, for $n > 2$, the $CW_n$ is not
descriptively self-dual for generic $P^2_{N_n}$; that when descriptive self-duality is restored,
in some measure at least, by restricting $P^2_{N_n}$, then intrinsic self-duality does
not yet exist; and that finally there exists a class of sets $P^2_{N_n}$ with $2n + 2$ absolute
constants for which the $CW_n$ is intrinsically self-dual. That $CW_n$ for generic
$P^2_{N_n}$ is not intrinsically self-dual is proved by Room [5; 77].

One is always inclined to accept as inevitable that attenuation of geometric 
properties which accompanies the process of generalization to spaces of higher 
dimension. The basis of the attenuation is the greater variety of possibilities 
in the higher dimension. Thus a planar conic generalizes into either a cubic 
space curve or into a quadric surface and the properties of the conic are dispersed 
over these two diverse figures. It is interesting then to find instances of the 
fact that all the salient features of a figure may be preserved under generalization 
provided the generalization is followed by properly chosen specialization. 


2. Non-duality of the generic $CW_n$. We wish to prove the theorem:


(1) For $n \geq 3$ the $CW_n$ derived from a generic set $P^2_{N_n}$ is not a descriptively
self-dual figure.


For this purpose we have to show that $CW_n$ defines projectively a figure such
that the dual figure dually defined does not exist. Let $P^2_{N_n}$ be generic, and let
$W_n$ be the White surface of points obtained by mapping the plane upon $W_n$
as described above. We consider first the prime $\mu_{ij}$ on the incident line, $l_i$,
and $[n - 1]$, $\lambda_j$. This prime corresponds to a curve $H_{ij}$ of the mapping system
which has a node at $p_i$ and the factor $C_j$, since $\mu_{ij}$ cuts $W_n$ in $l_i$ and the curve $k_j$.
Hence $H_{ij}$ is the product of the line $(p_i p_j)$ and $C_j$, and the prime $\mu_{ij}$ cuts $W_n$
in $l_i$, $k_j$, and a further residual rational norm-curve $N_{ij}$ of order $n - 1$, the
map of the line $(p_i p_j)$. In the pencil $\pi_j$ of primes on $\lambda_j$, the $N_n - 1$ primes
$\mu_{ij}$ $(i \neq j)$ are projective to the $N_n - 1$ lines $(p_i p_j)$ of the pencil on $p_j$, since the
curves $H_{ij}$ have the fixed factor $C_j$ and the variable factor $(p_i p_j)$. Hence


(2) The $N_n(N_n - 1)$ primes $\mu_{ij}$ are arranged in $N_n$ pencils $\pi_j$ which projectively
are the pencils of a ternary $N_n$-point, $P^2_{N_n}$.







The algebraic conditions that pencils $\pi_1$, $\pi_2$, $\pi_3$ be those of a ternary $n$-point
are of the following simple form [1; 195, (14)]:


(3) $D(1; 23, 45) \cdot D(2; 31, 45) \cdot D(3; 12, 45) = 1$,


where $D(1; 23, 45)$ is the double ratio of $\mu_{12}\mu_{13}$, $\mu_{14}\mu_{15}$.

The dual of the primes $\mu_{ij}$ are the points $m_{ij}$, where the line $l_i$ is met by $\lambda_j$.
These points $m_{ij}$ are also distributed in $N_n$ pencils $\rho_i$ on the lines $l_i$. With
respect to these we prove the theorem that
(4) The $N_n$ pencils $\rho_i$ of points $m_{ij}$ on lines $l_i$ of $CW_n$ are not projective to the
pencils of a ternary $N_n$-point $\overline{P}^2_{N_n}$ when $n > 2$, and when the $P^2_{N_n}$ which defines
$CW_n$ is generic.

The proof of this theorem carries with it the proof of (1) since (2) asserts
the existence of a projective property and (4) asserts the non-existence of the
dual property.

The point $m_{ij}$ on $W_n$ is on $l_i$ and $k_j$. It is therefore the map of the direction
at $p_i$ on $C_j$. Thus the pencil $\rho_i$ of points $m_{ij}$ is projective to the pencil of
tangents to curves $C_j$ at $p_i$.

For $n = 2$ the opposite of (4) is correct, $\overline{P}^2_6$ being the same as $P^2_6$. Indeed,
in this case the mapping is carried out by cubic curves on $P^2_6$. There is a
pencil of such curves, nodal at $p_i$, and the pairs of nodal tangents are in an
involution. Included among these nodal curves are the degenerate members
$(p_i p_j) \cdot C_j$ whose tangents are $(p_i p_j)$ and the tangent to $C_j$ at $p_i$. Thus the
pencil $\pi_i$ on $\lambda_i$ and the pencil $\rho_i$ on $l_i$ are projective in such wise that $\mu_{ji}$
corresponds to $m_{ij}$. This indeed is a consequence of the existence of the Schur
quadric. This polarity interchanges $l_i$, $\lambda_i$ and $l_j$, $\lambda_j$, and therefore also
interchanges $m_{ij}$ with $\mu_{ji}$.

We next examine $n = 3$ to see that (4) is valid in that case. We remark
provisionally that, if the pencils concerned belong to a $\overline{P}^2_{10}$ for generic $P^2_{10}$, they
must also do so for a particular $P^2_{10}$. For, the necessary double-ratio relations
will continue to exist as the points of $P^2_{10}$ change. Let, then, $p_1, \dots, p_6$ be
generically chosen, and let $p_7, \dots, p_{10}$ be on a line $\xi$. The cubics $C_1, \dots, C_6$
must contain the factor $\xi$ with a residual factor which is a conic $c_1, \dots, c_6$ on
the five of these six points other than $p_1, \dots, p_6$ respectively. Then, as for
$n = 2$, the tangents to $C_2, \dots, C_6$ at $p_1$ are those of $c_2, \dots, c_6$ at $p_1$, and
they are projective to the lines from $p_1$ to $p_2, \dots, p_6$ under an involution $I_1$.
Thus, however the remaining four points of the desired $\overline{P}^2_{10}$ may lie, we may
take the first six points of $\overline{P}^2_{10}$ at $p_1, \dots, p_6$. Let $\xi^{(1)}$, $\xi^{(2)}$, $\xi^{(3)}$ be lines on
$p_1$, $p_2$, $p_3$ respectively which do not meet in a point; let $\xi^{(1)\prime}$, $\xi^{(2)\prime}$, $\xi^{(3)\prime}$ be their
partners in the involutions $I_1$, $I_2$, $I_3$ respectively. Let $C_7$ be the cubic curve
on $p_1, \dots, p_6$ which has at $p_1$, $p_2$, $p_3$ the respective tangents $\xi^{(1)\prime}$, $\xi^{(2)\prime}$, $\xi^{(3)\prime}$;
let $\xi$ be an arbitrary line which meets $C_7$ in $p_8$, $p_9$, $p_{10}$; and let $p_7$ be a further
arbitrary point on $\xi$. We seek to construct the point $\overline{p}_7$ of the desired $\overline{P}^2_{10}$.
For this we add the tangents $\xi^{(1)\prime}$, $\xi^{(2)\prime}$, $\xi^{(3)\prime}$ to the pencils at $p_1$, $p_2$, $p_3$; and we
apply the involutions $I_1$, $I_2$, $I_3$ respectively to them so as to obtain the first
six points of $\overline{P}^2_{10}$ at $p_1, \dots, p_6$, and thus get lines $\xi^{(1)}$, $\xi^{(2)}$, $\xi^{(3)}$ which do not
meet in the required point $\overline{p}_7$ of the required $\overline{P}^2_{10}$. Since $\overline{P}^2_{10}$ does not exist
for this particular $P^2_{10}$, it does not exist for the generic $P^2_{10}$.

We complete the proof of (4) by showing that, if (4) is true for $n - 1$, it is
also true for $n$. Let $P^2_{N_n}$ be chosen so that it is composed first of a generic set
$P^2_{N_{n-1}}$ and a set of $n + 1$ remaining points on a line $\xi$. Then $C_1, \dots, C_{N_{n-1}}$
have a common factor $\xi$ and residual factors $c_1, \dots, c_{N_{n-1}}$, which are the
corresponding curves for the set $P^2_{N_{n-1}}$. If the tangents to the curves $C$ of $P^2_{N_n}$
define a ternary set $\overline{P}^2_{N_n}$, it is necessary that the tangents to the curves $c$ of
$P^2_{N_{n-1}}$ define a set $\overline{P}^2_{N_{n-1}}$ which is a part of $\overline{P}^2_{N_n}$. Thus (4) can fail for $n$
only if it fails for $n - 1$.

If we take account of the possibility of mapping the points of a plane upon 
either the points, or the primes, of [n + 1], we may state the above results as 
follows. 


(5) For the generic $CW_n$ with $N_n$ lines $l$ and $N_n$ spaces $\lambda$, there must exist a White
surface $W_n$ which contains the points on the lines $l$, or else there must exist a White
envelope $\overline{W}_n$ which contains the primes on the spaces $\lambda$, but $W_n$ and $\overline{W}_n$ do not
co-exist.


Because of this curious lack of symmetry in the White double-$N$ configuration
we raise the question in the final section as to whether configurations may
exist which have neither a $W_n$ nor a $\overline{W}_n$. In the next section we find a special
class of cases for which both $W_n$ and $\overline{W}_n$ exist.


3. Configurations $CW_n$ defined by a surface $W_n$ and an envelope $\overline{W}_n$. It is
convenient to introduce the $W_n$ analytically in terms of the trilinear form [4]

(1) $T(2, n, n + 1) = (\alpha x)(\beta y)(\gamma z) = \sum_i \sum_j \sum_k a_{ijk}\, x_i y_j z_k$
$(i = 0, 1, 2;\ j = 0, \dots, n;\ k = 0, \dots, n + 1)$,

with $\xi$, $\eta$, $\zeta$ as contragredient coordinates in the spaces $[2]$, $[n]$, $[n + 1]$ of $x$, $y$, $z$
respectively. If $z^{(0)}$ on $W_n$ is the map of $x^{(0)}$ on $[2]$, the equation of $z^{(0)}$ in $[n + 1]$ is

(2) $(z^{(0)}\zeta) = \begin{vmatrix} (\alpha x^{(0)})\,\beta_j\,\gamma_k \\ \zeta_k \end{vmatrix} = 0$.

For each section, $Q^{N_{n-1}}(\zeta)$, of $W_n$ by a given prime $\zeta$, this equation (2) yields the
corresponding curve, $Q^{n+1}(\zeta)$, of the mapping system on $P^2_{N_n}$. This set of
points $p_h$ is obtained from the pairs, $x, y = p_h, q_h$, which are neutral for $z$ in
$T = 0$, i.e.,

(3) $(\alpha p_h)(\beta q_h)\gamma_k = 0$.










These neutral pairs are defined independently of $z$ by the double identity (cf.
[4; §5, (4)]):
(4) $\sum_h (q_h\eta)(p_h\xi)^{n-1} \equiv 0$ $(h = 1, \dots, N_n)$.


If in $T = 0$ we restrict $z$ to the prime $(\zeta z) = 0$, then $T = 0$ becomes
a $T'(2, n, n) = 0$ with spaces $[x]$, $[y]$, $[z'] = \zeta$. Then the curve $Q^{n+1}(\zeta)$
in (2) is the locus of points $x^{(0)}$ for which the bilinear form in $y$, $z'$ is singular.
For given $x^{(0)}$ on this curve, the two singular points $y^{(0)}$, $z^{(0)}$ are furnished by the
equation

(5) $\begin{vmatrix} (\alpha x^{(0)})\,\beta_j\,\gamma_k & \eta_j \\ \zeta_k & 0 \end{vmatrix} = (z^{(0)}\zeta) \cdot (y^{(0)}\eta)$.

Thus $x^{(0)}$ on $Q^{n+1}(\zeta)$ is mapped upon $z^{(0)}$ on $Q^{N_{n-1}}(\zeta)$, and upon $y^{(0)}$ on $S^{N_{n-1}}(\zeta)$
by means of two complete linear series $g_1^{N_{n-1}}$, $g_2^{N_{n-1}}$ which are residual to each
other in the linear series cut out on $Q^{n+1}(\zeta)$ by all curves of order $n$. We call
these two paired curves, $Q^{N_{n-1}}(\zeta)$, $S^{N_{n-1}}(\zeta)$, "Reye curves". The equivalences
which define them projectively for given $Q^{n+1}(\zeta)$ on $P^2_{N_n}$ are as follows:
(6) $P^2_{N_n} + g_1^{N_{n-1}} \equiv (n + 1)L, \qquad g_1^{N_{n-1}} + g_2^{N_{n-1}} \equiv nL$,

where $L$ is a line section of $Q^{n+1}(\zeta)$. In the mapping (5), the points $x_h$ of $P^2_{N_n}$
pass into the points $z_h$ on $W_n$, where $\zeta$ cuts the lines $l_h$ of $CW_n$, and, in the space
$[y]$, into the points $q_h$ of (3) and (4), a set which we call $Q^n_{N_n}$.

A pair of Reye curves are not in general projective to each other in such wise
that $y^{(0)}$ corresponds to $z^{(0)}$. When this happens, and when the space $[n]$ of
$\zeta$ in $T'(2, n, n)$ is projected upon the space $[y]$ so that $z^{(0)}$ falls on $y^{(0)}$, the form
$T'(2, n, n)$ reduces to the polarized form $(\alpha' x)(\beta' y)^2$. Then either Reye
curve is a "Jacobian curve", the locus of nodes of quadrics of a net. The
generic $T'(2, n, n)$ has $3(n + 1)^2 - 9 - 2n(n + 2) = (n + 1)^2 - 7$ absolute
constants; the net of quadrics has $3N_n - 9 - n(n + 2)$ absolute constants.
Thus it is $N_{n-2}$ conditions that a Reye curve becomes a Jacobian curve.
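
The count of conditions can be checked directly from the two expressions for the absolute constants. The sketch below only re-does the arithmetic stated in the text; the function names are introduced for the example.

```python
from math import comb

def N(n):
    """N_n = binomial(n + 2, 2)."""
    return comb(n + 2, 2)

def constants_trilinear(n):
    """Absolute constants of the generic T'(2, n, n), as stated in the text."""
    return 3 * (n + 1) ** 2 - 9 - 2 * n * (n + 2)

def constants_net(n):
    """Absolute constants of a net of quadrics in [n], as stated in the text."""
    return 3 * N(n) - 9 - n * (n + 2)

def jacobian_conditions(n):
    """Number of conditions for a Reye curve to become a Jacobian curve."""
    return constants_trilinear(n) - constants_net(n)

table = {n: (constants_trilinear(n), constants_net(n), jacobian_conditions(n))
         for n in range(2, 7)}
```

For every $n \geq 2$ the difference equals $\tfrac{1}{2}n(n - 1) = N_{n-2}$, which is also the genus of a non-singular plane curve of order $n + 1$.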

For the Jacobian curve, the linear series $g_1^{N_{n-1}}$, $g_2^{N_{n-1}}$ on $Q^{n+1}(\zeta)$ coincide into
a contact linear series. Since the generic $Q^{n+1}(\zeta)$ has the genus $N_{n-2}$, there are
$\infty^{N_{n-2}}$ distinct series $g_1^{N_{n-1}}$, whereas there are only a finite number of
contact $g^{N_{n-1}}$'s. This confirms the above number, $N_{n-2}$, of conditions for a
Jacobian curve. In the contact case the equivalences (6) yield $2g_1^{N_{n-1}} \equiv nL$,
$2P^2_{N_n} \equiv (n + 2)L$. Moreover, this last equivalence, in combination with (6),
which define $g_1^{N_{n-1}}$ and $g_2^{N_{n-1}}$, respectively, yields again $g_1^{N_{n-1}} \equiv g_2^{N_{n-1}}$. Hence,
we have the following.


(7) If $P^2_{N_n}$ on $Q^{n+1}(\zeta)$ satisfies the equivalence $2P^2_{N_n} \equiv (n + 2)L$, then the Reye
curves, $Q^{N_{n-1}}(\zeta)$, $S^{N_{n-1}}(\zeta)$, are projectively equivalent Jacobian curves.







For given $P^2_{N_n}$, there are only $\infty^{n+1}$ incident curves $Q^{n+1}(\zeta)$. Thus the
$N_{n-2}$ conditions that $2P^2_{N_n} \equiv (n + 2)L$ cannot be satisfied if $n > 3$; they can
be satisfied in $\infty^1$ ways if $n = 3$; and in $\infty^2$ ways if $n = 2$. We have, therefore,
the theorem:


(8) The generic White surface $W_n$ has no prime sections which are Jacobian curves
if $n \geq 4$; and it has only $\infty^1$ such sections if $n = 3$. However, when $n = 2$ and
$W_2$ is a cubic surface with isolated double-six, there will be $\infty^2$ sections $\zeta$ of $W_2$ for
which the cubic curves $Q^3(\zeta)$ on $P^2_6$ are such that $2P^2_6 \equiv 4L$, and these sections $\zeta$ are
the cubic envelope $\overline{W}_2$ which has the same double-six, $CW_2$.
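
The three cases of (8) come from comparing the $n + 1$ freedoms of a prime $\zeta$ with the $N_{n-2}$ conditions. A minimal sketch of that comparison, with a hypothetical function name:

```python
from math import comb

def jacobian_section_freedom(n):
    """Freedom n + 1 of the prime minus the N_{n-2} = n(n-1)/2 conditions."""
    return (n + 1) - comb(n, 2)

freedoms = {n: jacobian_section_freedom(n) for n in range(2, 8)}
```

A negative value means that the generic $W_n$ carries no such section, which is the case for every $n \geq 4$.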


This special situation in connection with $W_2$ occurs also in connection with $W_n$
if the set $P^2_{N_n}$ is the set of nodes of a rational plane curve, $p_2^{n+3}(t)$. We recall [2; §6]
that if, on a plane $[x']$, the point $x'$ is determined as the intersection of tangents
$t_1$, $t_2$ of a norm-conic, $A(t)$, then the pairs of nodal parameters $t_{h1}$, $t_{h2}$ at the nodes
$p_h$ of $p_2^{n+3}(t)$ determine in $[x']$ the "nodular" points $\overline{p}_h$ of a nodular set $\overline{P}^2_{N_n}$.
We recall also that in addition to the trilinear form $T(2, n, n + 1)$ used above
with pairs $x, y = p_h, q_h$ neutral for $z$, there now exists also a second trilinear form
$T'(2, n, n + 1)$ in variables $x'$, $y$, $z'$ with pairs $\overline{p}_h$, $q_h$ neutral for $z'$. For generic
rational curve the nodal set $P^2_{N_n}$ and the nodular set $\overline{P}^2_{N_n}$ are not projectively
equivalent. We prove the following theorem which includes the case $W_2$ above.


(9) If $P^2_{N_n}$ is the set of nodes of a generic rational planar curve $p_2^{n+3}(t)$ [$3n$ absolute
constants], and if the plane is mapped on the points of a White surface $W_n$ in
$[n + 1]$, then $p_2^{n+3}(t)$ is mapped on a rational norm-curve $N^{n+1}(t)$ bisecant to the
lines $l_i$ of the $CW_n$. The $\infty^2$ collinear $(n + 1)$-points of $p_2^{n+3}(t)$ are mapped on
$(n + 1)$-points of $N^{n+1}$ on primes $\zeta$ which cut $W_n$ in Jacobian curves. The $\infty^2$
primes thus obtained lie on a White envelope $\overline{W}_n$ which has the same $CW_n$ as $W_n$.
The primes of $\overline{W}_n$ are the map of points $x'$ by curves of order $n + 1$ on the nodular
set $\overline{P}^2_{N_n}$.

We observe first that the existence of $N^{n+1}(t)$ on $W_n$ is, because of the mapping,
an obvious consequence of the existence of the irreducible rational curve $p_2^{n+3}(t)$
with nodes at $P^2_{N_n}$. Let the line $\xi$ cut $p_2^{n+3}(t)$ in the points $t_1, t_2, t_3, \dots, t_{n+3}$.
On the $n + 1$ points $t_3, \dots, t_{n+3}$ of $p_2^{n+3}(t)$ there is one, and only one, curve
$Q^{n+1}(\zeta)$, since $p_2^{n+3}(t)$ is irreducible. Let $L$ be the set of these $n + 1$ points on
$Q^{n+1}(\zeta)$. Using on $Q^{n+1}(\zeta)$ the equivalence $2P^2_{N_n} + L \equiv (n + 3)L$, obtained
from $p_2^{n+3}(t)$, we obtain the equivalence $2P^2_{N_n} \equiv (n + 2)L$, which, according to
(7), implies that the section of $W_n$ by $\zeta$, $Q^{N_{n-1}}(\zeta)$, as well as $S^{N_{n-1}}(\zeta)$, is a Jacobian
curve. We have then only to show that the primes $\zeta$ so obtained are those of
a White envelope $\overline{W}_n$ obtained by mapping from the plane $[x']$ by using $\overline{P}^2_{N_n}$.

We now introduce specific mappings to prove the remaining statements. We
had obtained in [2; §9, (8)] from $T(2, n, n + 1)$ a form,

(10) $H(x^{n+1}, t^{n+1}) = H_0(x)t^{n+1} + \binom{n+1}{1} H_1(x)t^{n} + \binom{n+1}{2} H_2(x)t^{n-1} + \cdots + H_{n+1}(x)$,

whose polarized form $H(x^{n+1}; t_1, t_2, \dots, t_{n+1}) = 0$ is the $(n + 1)$-ic curve on
the nodes $P^2_{N_n}$ of $p_2^{n+3}(t)$ and on the further points of $p_2^{n+3}(t)$ with parameters
$t = t_1, t_2, \dots, t_{n+1}$. Thus the mapping of $[x]$ on $W_n$ is accomplished by setting
(11) $W_n$: $z_0 = H_0(x), \quad z_1 = H_1(x), \quad z_2 = H_2(x), \quad \dots, \quad z_{n+1} = H_{n+1}(x)$.
Furthermore, when $x$ is at the point $t = s$ of $p_2^{n+3}(t)$, then $H(x^{n+1}, t^{n+1})$ becomes
the perfect power $(ts)^{n+1} = (t - s)^{n+1}$. Hence the curve $N^{n+1}(t)$ on
$W_n$ is given by

(12) $N^{n+1}$: $z_0 = 1, \quad z_1 = -t, \quad z_2 = t^2, \quad \dots, \quad z_{n+1} = (-1)^{n+1}t^{n+1}$.


There is also a well-known symmetric involutorial form,
(13) $(\delta_1 t_1)^{n+1}(\delta_2 t_2)^{n+1}(\delta t)^{n+1} = 0$,

which expresses that the points of $p_2^{n+3}(t)$ with parameters $t_1$, $t_2$, $t$ are collinear.
If in this we replace the symmetric combinations of $t_1$, $t_2$ by coordinates $x'$
referred to $A(t)$, it becomes a form

(14) $D(x'^{\,n+1}, t^{n+1}) = D_0(x')t^{n+1} + D_1(x')t^{n} + D_2(x')t^{n-1} + \cdots + D_{n+1}(x') = 0$,

which furnishes for given $x' = x'(t_1, t_2)$ the parameters of the $n + 1$ further
intersections with $p_2^{n+3}(t)$ of the line joining the points $t_1$, $t_2$ of $p_2^{n+3}(t)$. If
$t_1$, $t_2$ is the pair $t_{h1}$, $t_{h2}$ of nodal parameters at the node $p_h$, then the $(n + 1)$-ic
in $t$ of (13), (14) vanishes identically, or the curves $D_0(x') = 0, \dots, D_{n+1}(x') = 0$
are on the nodular set $\overline{P}^2_{N_n}$. Assuming for the moment that the curves
$D_i(x') = 0$ are linearly independent, a fact which will appear presently, we can
use them to map the points $x'$ upon the primes of $\overline{W}_n$ by the relations:

(15) $\overline{W}_n$: $\zeta_0 = D_{n+1}(x'), \quad \zeta_1 = -D_n(x'), \quad \dots, \quad \zeta_{n+1} = (-1)^{n+1}D_0(x')$.

If $D(x'^{\,n+1}, t^{n+1}) = 0$ has roots $t = t_3, \dots, t_{n+3}$, the parameters of the $n + 1$
further intersections of the line joining the points $t_1$, $t_2$ of $p_2^{n+3}(t)$, and if
$H(x^{n+1}; t_3, t_4, \dots, t_{n+3}) = 0$ is the $(n + 1)$-ic curve on $P^2_{N_n}$ which cuts $p_2^{n+3}(t)$
in the same points, then the apolarity condition of $D(x'^{\,n+1}, t^{n+1})$ and $H(x^{n+1}, t^{n+1})$
is the incidence condition of $z(x)$ on $W_n$ with $\zeta(x')$ on $\overline{W}_n$, namely (cf. (11), (14)),
(16) $z_0\zeta_0 + z_1\zeta_1 + \cdots + z_{n+1}\zeta_{n+1} = H_0(x) \cdot D_{n+1}(x') - H_1(x) \cdot D_n(x') + \cdots = 0$.
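
The alternating sum in (16) can be tested on the simplest case, a point of the norm-curve (12). There $H_k = (-s)^k$, and the sum reduces to the value of the binary form (14) at $t = s$, so that the point $s$ of $N^{n+1}$ lies on the prime $\zeta$ exactly when $s$ is a root. The sketch below assumes this reading of (10), (14), (16), with binomial coefficients attached to the $H_k$ only; the helper names and the sample roots are hypothetical.

```python
def poly_from_roots(roots):
    """Coefficients D_0, ..., D_m (descending powers of t) of prod (t - r)."""
    coeffs = [1]
    for r in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, cf in enumerate(coeffs):
            nxt[i] += cf
            nxt[i + 1] -= cf * r
        coeffs = nxt
    return coeffs

def norm_curve_point(s, m):
    """z_k = H_k = (-s)^k, i.e. (t - s)^m = sum_k C(m, k) H_k t^(m-k); cf. (12)."""
    return [(-s) ** k for k in range(m + 1)]

def incidence(H, D):
    """Right side of (16): sum_k (-1)^k H_k D_{m-k}."""
    m = len(H) - 1
    return sum((-1) ** k * H[k] * D[m - k] for k in range(m + 1))

def evaluate(D, t):
    """Value of D_0 t^m + D_1 t^(m-1) + ... + D_m."""
    m = len(D) - 1
    return sum(D[k] * t ** (m - k) for k in range(m + 1))

D = poly_from_roots([1, 2, 3])                     # hypothetical roots t_3, t_4, t_5
on_prime = incidence(norm_curve_point(2, 3), D)    # s = 2 is a root
off_prime = incidence(norm_curve_point(5, 3), D)   # s = 5 is not a root
```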


Let $t_1$, $t_2$ approach $t_{h1}$, $t_{h2}$ in such a manner that the ratio $(t_1 - t_{h1})/(t_2 - t_{h2})$
approaches the value $\theta$. Then the point $x'(t_1, t_2)$ approaches the nodular point
$\overline{p}_h$ in a certain direction, and the line joining $t_1$, $t_2$ of $p_2^{n+3}(t)$ is approaching a
line $\xi$ on the node $p_h$. The limiting position of the curve (16) which always
passes through $p_h$ is $\xi \cdot C_h$. Thus the primes of $\overline{W}_n$ which correspond to
directions $\theta$ about $\overline{p}_h$ are the primes on the $[n - 1]$, $\lambda_h$, of the $CW_n$. Thus $CW_n$
and $C\overline{W}_n$ are identical. Moreover, these primes are enough to exhaust the space
$[n + 1]$ of $\zeta$ in (15) so that the curves $D$ in (15) are linearly independent.

The case $n = 2$ for which there are $\infty^2$ rational curves $p_2^5(t)$ with nodes at $P^2_6$
deserves some notice. Of the $\infty^3$ cubic curves $Q^3(\zeta)$ on $P^2_6$ there are $\infty^2$ curves
$Q^3(\zeta)$ on which the equivalence $2P^2_6 \equiv 4L$ exists. These contribute the planes $\zeta$
of the envelope $\overline{W}_2$ which has the same double-six as the cubic surface $W_2$.
The peculiarity is that any one of the $\infty^2$ collinear triads of a cubic $Q^3(\zeta)$ is also
a collinear triad of some one of the $\infty^2$ rational curves $p_2^5(t)$. These collinear
triads of $Q^3(\zeta)$ map into triads on the section of $W_2$ by $\zeta$ which are the contacts
of a system of contact conics. For, the curves $p_2^5(t)$ map into cubic curves
bisecant to the lines $l_i$; the lines $\xi$ map into cubic curves bisecant to the lines $\lambda_i$.
A cubic curve of one system meets a cubic curve of the other in five points,
$z_1, \dots, z_5$, and the two curves are on a quadric $q$ which touches $W_2$ at these
five points. Then the plane $(z_3, z_4, z_5)$ is a plane $\zeta$ of $\overline{W}_2$ which cuts $W_2$
in a cubic curve, and $q$ in a conic, which touch at $z_3$, $z_4$, $z_5$. Since the
rational curves $p_2^5(t)$ have perspective conics, the case $n = 2$ will also appear in
the next section.

The case $n = 3$ with $P_{10}^2$ nodal for a single $p_2^6(t)$ is also somewhat peculiar. For, then (compare [2; §8]) there is also a rational curve $p_2^6(\tau)$ with nodes at the nodular set $P'^{2}_{10}$. Thus, in addition to the curve $N^4(t)$ on $W_3$ bisecant to the lines $l_h$ of $CW_3$, there is a dual envelope $N^4(\tau)$ on $W_3$ with the planes $\lambda_h$ of $CW_3$ as axes. This appears to be one instance of descriptive self-duality without the accompaniment of intrinsic self-duality, though it is not possible to be categorical with respect to descriptive self-duality without examining all the associated loci of the figure to make sure their duals exist.

For cases $n > 3$ it does not seem likely that the existence of $N^{n+1}(t)$ on $W_n$ implies the existence of a corresponding $N^{n+1}(\tau)$ on $W_n$, so that the descriptive self-duality, restored for the points $m_{ij}$ of $CW_n$ as contrasted with the primes $\mu_{ij}$ of $CW_n$ by assuming that $P_{N_n}^2$ is nodal, would fail to persist under a more searching examination. In the next paragraph we find a $P_{N_n}^2$ with $2n + 2$ absolute constants for which $CW_n$ is intrinsically self-dual, being unaltered by a polarity.


4. Intrinsically self-dual configurations, $CW_n$. Two rational planar envelopes $r_2^{k}(t)$, $r_2^{n+3-k}(t)$ of class $k$ and $n + 3 - k$ respectively, with lines in (1, 1) correspondence through like-named parameters $t$, "generate" a rational planar point-locus $p_2^{n+3}(t)$, unless they are specially situated in that some corresponding lines coincide throughout. Conversely, a given $p_2^{n+3}(t)$ can be generated in this way by means of curves, $r_2^{k}(t)$ and $r_2^{n+3-k}(t)$, "perspective" to it. We exclude the case $k = 1$, which implies that all of the nodes of $p_2^{n+3}(t)$ coincide at an $(n + 2)$-fold point, this being a perspective point, $r_2^{1}(t)$.

We consider in this section the curve $p_2^{n+3}(t)$ with a perspective conic, $K(t)$. If $\xi(t)$ is a tangent of $K(t)$, then $\xi$ is a perspective line of $K(t)$, and the tangents of $K(t)$ set up a (1, 1) correspondence between the points of $\xi$ and of $p_2^{n+3}(t)$. Since, conversely, the class of the locus of lines joining corresponding points is only two, $n + 2$ of the corresponding pairs must coincide. Hence


(1) The $n - 1$ conditions that a line $\xi$ meet $p_2^{n+3}(t)$ in $n + 2$ points whose parameters on the line and on $p_2^{n+3}(t)$ are projective are poristic for the line $\xi$. Indeed, $n - 2$ of these conditions fall on $p_2^{n+3}(t)$ and imply that $p_2^{n+3}(t)$ has a perspective conic $K(t)$, and the remaining condition requires that $\xi$ be a tangent of $K(t)$.


Thus the rational curve $p_2^{n+3}(t)$ under consideration, and also its nodal set $P_{N_n}^2$, has $3n - (n - 2) = 2n + 2$ absolute constants.
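
The counts used in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the text at this point: that $N_n$ is the number of nodes of a rational plane curve of order $n + 3$ (genus formula), that the $3n$ constants arise as $3(n+4)$ coefficients less one scale factor, three parameter changes and eight plane collineations, and that the nodes impose independent conditions on adjoints. The function names are hypothetical.

```python
def node_count(n: int) -> int:
    """Nodes of a rational plane curve of order n + 3, from the genus formula."""
    d = n + 3
    return (d - 1) * (d - 2) // 2


def absolute_constants(n: int, perspective_conic: bool) -> int:
    """Absolute constants of the rational curve of order n + 3.

    Assumed count: three binary forms of order n + 3 give 3(n + 4) coefficients;
    remove 1 for scale, 3 for changes of parameter, 8 for plane collineations.
    A perspective conic costs n - 2 further conditions, as in theorem (1).
    """
    generic = 3 * (n + 4) - 1 - 3 - 8
    return generic - (n - 2) if perspective_conic else generic


def adjoint_system_dimension(n: int) -> int:
    """Dimension of the linear system of curves of order n + 1 on the nodes,
    assuming the nodes impose independent conditions."""
    d = n + 1
    return d * (d + 3) // 2 - node_count(n)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, node_count(n), absolute_constants(n, False),
              absolute_constants(n, True), adjoint_system_dimension(n))
```

For $n = 2$ and $n = 3$ the node counts are 6 and 10, matching the double-six and the ten nodes of a rational sextic.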

With $a_i$ and $b_i$ linear forms in $x$, let $p_2^{n+3}(t)$ be generated by its perspective conic $K(t)$ and a perspective envelope of class $n + 1$, given by

(2) $a_0 t^2 + a_1 t + a_2 = 0, \qquad b_0 t^{n+1} + b_1 t^{n} + b_2 t^{n-1} + \cdots + b_{n+1} = 0.$


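The equation in variables $x$ of $p_2^{n+3}(t)$ is the Sylvester resultant,

(3) $R = R_{n+1,2} = \begin{vmatrix} a_0 & a_1 & a_2 & 0 & \cdots & 0 \\ 0 & a_0 & a_1 & a_2 & \cdots & 0 \\ \vdots & & & & & \vdots \\ b_0 & b_1 & b_2 & b_3 & \cdots & 0 \\ 0 & b_0 & b_1 & b_2 & \cdots & b_{n+1} \end{vmatrix} = 0.$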


The subscripts in $R_{n+1,2}$ indicate the number of rows of coefficients $a$, and of coefficients $b$, in $R$. We denote also by $R_{n+1-j,\,2-j}$ ($j = 0, 1, 2$) the matrix obtained from $R$ by dropping the last $j$ rows of coefficients $a$, the last $j$ rows of coefficients $b$, and the last $j$ columns.
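
As a numerical illustration of the resultant (3), the following sketch builds the Sylvester matrix of two binary forms given by numeric coefficients and evaluates its determinant exactly. It is not taken from the paper: the coefficients here are plain numbers standing in for the values of the linear forms $a_i$, $b_i$ at one point $x$, and the function names are hypothetical. The determinant vanishes exactly when the two forms have a common root, which is the condition that $x$ lie on the curve.

```python
from fractions import Fraction


def sylvester_matrix(a, b):
    """Sylvester matrix of two polynomials (coefficients, highest power first).

    With deg a = 2 and deg b = n + 1 this has n + 1 rows of a and 2 rows of b,
    the shape called R_{n+1,2} in the text.
    """
    deg_a, deg_b = len(a) - 1, len(b) - 1
    size = deg_a + deg_b
    rows = []
    for shift in range(deg_b):          # deg_b shifted copies of a
        rows.append([0] * shift + list(a) + [0] * (size - deg_a - 1 - shift))
    for shift in range(deg_a):          # deg_a shifted copies of b
        rows.append([0] * shift + list(b) + [0] * (size - deg_b - 1 - shift))
    return rows


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    mat = [[Fraction(v) for v in row] for row in matrix]
    size = len(mat)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if mat[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            mat[col], mat[pivot] = mat[pivot], mat[col]
            det = -det
        det *= mat[col][col]
        for r in range(col + 1, size):
            factor = mat[r][col] / mat[col][col]
            for c in range(col, size):
                mat[r][c] -= factor * mat[col][c]
    return det


def resultant(a, b):
    """Sylvester resultant of a and b."""
    return determinant(sylvester_matrix(a, b))


if __name__ == "__main__":
    conic_form = [1, -3, 2]                        # (t - 1)(t - 2)
    print(resultant(conic_form, [1, -1, 1, -1]))   # cubic (t - 1)(t^2 + 1): common root t = 1
    print(resultant(conic_form, [1, 0, 0, 1]))     # cubic t^3 + 1: no common root
```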

Two binary forms of orders $m$, $\mu$ have a unique apolar $(m + \mu - 2)$-ic. This, formed for the binary forms in (2), is the $(n + 1)$-ic obtained from the matrix $R_{n,1}$:

(4) $H(x^{n+1}, t^{n+1}) = H_0 t^{n+1} + \binom{n+1}{1} H_1 t^{n} + \cdots + H_{n+1}.$


If $x$ is at the point $x(t')$ on $p_2^{n+3}(t)$, the forms (2) have a common root $t'$ and the apolar form is $(tt')^{n+1}$. If $x$ is at a node $p_h$ of $p_2^{n+3}(t)$, the forms (2) have two common roots and the apolar form vanishes identically. Thus $H$ is the form of (10) of §3 and $H(x^{n+1}, t_1, \dots, t_{n+1}) = 0$ is the adjoint of order $n + 1$ of $p_2^{n+3}(t)$ which cuts $p_2^{n+3}(t)$ in further points $t = t_1, \dots, t_{n+1}$.

If all the adjoints on $x$ and $x'$ also go through $x''$, then all the binary $(n + 1)$-ics apolar to $H(x^{n+1}, t^{n+1})$ and $H(x'^{\,n+1}, t^{n+1})$ are also apolar to $H(x''^{\,n+1}, t^{n+1})$. Let $x''$ be the point $x''(t_1, t_2)$, the join of tangents $t_1$ and $t_2$ of $K(t)$. Then the $(n + 1)$-ics apolar to $H(x''^{\,n+1}, t^{n+1})$ include $(a_0 t^2 + a_1 t + a_2)_{x = x''} = (tt_1)(tt_2)$ taken with an arbitrary $(n - 1)$-ic, $(\alpha t)^{n-1}$. This same system of $(n + 1)$-ics is apolar



















to $H(x^{n+1}, t^{n+1})$, $H(x'^{\,n+1}, t^{n+1}) = (tt_1)^{n+1}$, $(tt_2)^{n+1}$ when $x$, $x'$ are at the points $t_1$, $t_2$ of $p_2^{n+3}(t)$. Hence

(5) The linear system, $H(x^{n+1}, t_1, t_2, t^{n-1}) = 0$, of adjoints of $p_2^{n+3}(t)$ on the points $t_1$, $t_2$ of $p_2^{n+3}(t)$, a system $(\infty^{n-1})$ for variable $t$, has a further base point at the point $x(t_1, t_2)$ of intersection of tangents $t_1$, $t_2$ of $K(t)$.


An immediate corollary of this is: 


(6) The linear system, $H(x^{n+1}, t_1, t_2, \dots, t_k, t^{n+1-k}) = 0$, of adjoints of $p_2^{n+3}(t)$ on the points $t_1, \dots, t_k$ of $p_2^{n+3}(t)$, a system $(\infty^{n+1-k})$ for variable $t$, has $\binom{k}{2}$ further base points at the intersections of the $k$ tangents $t_1, \dots, t_k$ of $K(t)$ ($k = 2, \dots, n + 1$). Thus to pass through the $\binom{k}{2}$ intersections of such a circumscribed $k$-line of $K(t)$ imposes only $k$ linear conditions on the adjoints of $p_2^{n+3}(t)$ when $k > 2$, and the linear system so defined has $k$ further base points on $p_2^{n+3}(t)$.
We now prove a theorem particularly important for our purpose, namely: 


(7) In the linear system $(\infty^{n+1})$ of adjoints of $p_2^{n+3}(t)$ of order $n + 1$, there is a linear system $(\infty^{n-1})$ of adjoints with a node at $p_h$, a node of $p_2^{n+3}(t)$. In this latter system there is a linear system $(\infty^{n-3})$ of adjoints with a triple point at $p_h$.


Let $t_{h1}$, $t_{h2}$ be the parameters of the node $p_h$ on $p_2^{n+3}(t)$. Consider the behavior of the linear system $(\infty^{n-1})$ of adjoints $H(x^{n+1}, t_1, t_2, t^{n-1}) = 0$ as $t_1$, $t_2$ approach $t_{h1}$, $t_{h2}$. Since the adjoints are already on $p_h$ and pass through $t_1$, $t_2$ on $p_2^{n+3}(t)$, then, as $t_1$ approaches $t_{h1}$, the adjoint touches the branch $t_{h1}$ of $p_2^{n+3}(t)$ at $p_h$. If in addition $t_2$ approaches $t_{h2}$ on the other branch, the adjoint must acquire a node and this node alone takes care of the contacts. Hence the linear system $(\infty^{n-1})$, $H(x^{n+1}, t_{h1}, t_{h2}, t^{n-1}) = 0$, is the system of adjoints nodal at $p_h$. Consider again the system consisting of the adjoints

$H(x^{n+1}, t_{h1}, t_{h2}, t_1, t^{n-2}) = 0.$


According to (6) this system has base points $t_{h1}$, $t_{h2}$, $t_1$ on $p_2^{n+3}(t)$ and the further base points $x(t_{h1}, t_{h2}) = p_h$, $x(t_{h1}, t_1)$, $x(t_{h2}, t_1)$. Thus the three extra intersections at $p_h$ of the adjoints nodal at $p_h$ are accounted for and two outside base points appear. Suppose now that $t_1$ approaches $t_{h1}$. The point $x(t_{h2}, t_1)$ approaches the direction at $p_h$ on the tangent $t_{h2}$ of $K(t)$. The point $x(t_{h1}, t_1)$ approaches the point $t_{h1}$ of $K(t)$. Thus we have:

(8) The linear system $(\infty^{n-2})$ of adjoints $H(x^{n+1}, t_{h1}^2, t_{h2}, t^{n-2}) = 0$ is nodal at $p_h$ with nodal tangents consisting of the tangent to $p_2^{n+3}(t)$ at $p_h$ on the branch $t_{h1}$ and of the tangent $t_{h2}$ of $K(t)$. The system has a further base point at the point $t_{h1}$ of $K(t)$.

Since this linear system (8) has a fixed node at $p_h$ with fixed nodal tangents, it contains a subsystem $(\infty^{n-3})$ with a triple point at $p_h$, which completes the proof of (7). It is, however, clear that the like linear system $(\infty^{n-2})$ consisting of $H(x^{n+1}, t_{h1}, t_{h2}^2, t^{n-2}) = 0$ also contains the system with triple point $p_h$, whence


(9) The linear system of adjoints with a triple point at $p_h$ has the equation $H(x^{n+1}, t_{h1}^2, t_{h2}^2, t^{n-3}) = 0$. This system has base points at the points $t_{h1}$, $t_{h2}$ of $K(t)$.
It is perhaps worth noting that, if $\lambda_0, \dots, \lambda_{n-1}$ are the parameters of the system of adjoints nodal at $p_h$, the three further conditions, linear in the $\lambda$'s, that the adjoints have a triple point at $p_h$ yield three primes in the space $[\lambda]$ of a pencil. This is $n - 2$ conditions on the three primes, which is precisely the number of conditions on $p_2^{n+3}(t)$ that it have a perspective conic. It may be then that, if the property (7) appears at one node of $p_2^{n+3}(t)$, it must appear at every other node. This indeed is true in the first pertinent case $n = 3$.

From (7) and (9) we deduce the following theorem toward which the argu- 
ment has been pointed: 


(10) At any node of $p_2^{n+3}(t)$, such as $p_1$, the lines $(p_1p_2), \dots, (p_1p_{N_n})$ are projective to the tangents at $p_1$ to the curves $C_2, \dots, C_{N_n}$.

For, the products $(p_1p_2) \cdot C_2, \dots, (p_1p_{N_n}) \cdot C_{N_n}$ are adjoints of order $n + 1$ nodal at $p_1$. Because of the existence of the system (9) with triple point at $p_1$, the pairs of nodal tangents are pairs of an involution. This involutorial correspondence has, in the case of the above products, the corresponding pairs mentioned in the theorem.

The property (10) translated to $CW_n$ mapped from $P_{N_n}^2$ states that the points $m_{ij}$ on the line $l_i$ and the primes $\mu_{ij}$ on the $[n - 1]$, $\lambda_i$, are projective, an obvious first requirement if $CW_n$ is to admit a correlation and thus be intrinsically self-dual. We proceed to find this correlation, which is a polarity in a quadric $Q$.

Let the incidence condition of tangent $t_1$ of $K(t)$ with point $t$ of $p_2^{n+3}(t)$ be given, after factoring out $(tt_1)$, by the equation

(11) $(k_1 t_1)(kt)^{n+2} = 0.$

This form in cogredient variables $t_1$, $t$ has precisely the number, $2n + 2$, of absolute constants required to determine uniquely $p_2^{n+3}(t)$ with perspective $K(t)$. For variable $t_1$, (11) determines a pencil of collinear $(n + 2)$-points of $p_2^{n+3}(t)$. If $t$, $t'$ belong to the same member of this pencil, then


(12) $Q = (qt)^{n+1}(qt')^{n+1} = (k_1 k_1')(kt)^{n+2}(k't')^{n+2}/(tt') = 0.$


We consider now the mapping of the plane of $p_2^{n+3}(t)$ upon the points of a White surface $W_n$ in $[n + 1]$ and the allied mapping of collinear $(n + 1)$-points of $p_2^{n+3}(t)$ upon the primes of a White envelope $W_n$ developed in the preceding section. The points $t$ of $p_2^{n+3}(t)$ map into the norm-curve $N^{n+1}(t)$ on $W_n$. Then $Q$ in (12) can be interpreted as a quadric $Q$ with apolar point pairs $t$, $t'$ on $N^{n+1}(t)$. For given $t$ in $Q = 0$ the $n + 1$ points $t'$ are on a prime, the polar prime of $t$ as to $Q$, and since these $n + 1$ points $t'$ on $p_2^{n+3}(t)$ are collinear, this polar prime




is on $W_n$. Since the coordinates of the prime are of degree $n + 1$ in $t$, these polar primes are those of a norm-curve $N^{n+1}(t)$ on $W_n$. Thus $Q = 0$ is the incidence condition of prime $t$ of $N$ with point $t'$ of $N$. If the node $p_h$ of $p_2^{n+3}(t)$ has parameters $t_{h1}$, $t_{h2}$, the tangent $t_{h1}$ of $K(t)$ cuts $p_2^{n+3}(t)$ again in points $t_{h2}, r_1, \dots, r_{n+1}$, and the tangent $t_{h2}$ of $K(t)$ cuts in points $t_{h1}, s_1, \dots, s_{n+1}$. Thus in $Q = 0$ we have, for $t = t_{h2}$, the $n + 1$ values $t' = r_1, \dots, r_{n+1}$. Now the adjoint of order $n + 1$ on $r_1, \dots, r_{n+1}$, being on $p_h$ also, must contain as a factor the tangent $t_{h1}$ of $K(t)$, the residual factor being $C_h$. Hence the corresponding prime of $N$ is a prime on the $[n - 1]$, $\lambda_h$, with parameter $t_{h2}$. Thus $l_h$ of $CW_n$, bisecant to $N^{n+1}$, corresponds under the polarity $Q$ to $\lambda_h$ on two primes of $N^{n+1}$. Hence
(13) If $p_2^{n+3}(t)$ with nodal $P_{N_n}^2$ has a perspective conic, the nodular set $P'^{2}_{N_n}$ of (9) of §2 is projective to $P_{N_n}^2$. The double-$N_n$ configuration mapped from $P_{N_n}^2$ is self-polar under a polarity $Q$ which interchanges the line $l_h$ and the $[n - 1]$, $\lambda_h$. Also $Q$ interchanges the $W_n$ on the points of the $N_n$ lines $l$ with $W_n$ on the primes of the $N_n$ spaces $\lambda$, and the $N^{n+1}$ bisecant to the lines $l$ with the $N^{n+1}$ bisecant to the spaces $\lambda$.

This is the type of double-$N_n$ configuration which retains for generic $n$ all the salient features of the double-six on a cubic surface.


5. The possibility of configurations $C_{N_n}$ more general than $CW_n$. At the outset we raised the question as to whether the configurations $CW_n$ are the most general of their type. If we ask for lines $l_h$ and spaces $[n - 1]$, $\lambda_h$, in $[n + 1]$ such that $l_h$ and $\lambda_h$ are not incident and that $l_h$ and $\lambda_k$ ($h \neq k$) are incident, we would certainly not expect to find only solutions which are "lopsided". This unfortunate lapse characterizes the $CW_n$. For either the points $m_{ij}$ on $l_i$, or the primes $\mu_{ij}$ on $\lambda_i$, define a set $P_{N_n}^2$, but not both, unless conditions are imposed on $CW_n$. Clearly, then, there is a presumption that more general configurations $C_{N_n}$ exist which are symmetrical to the extent that neither the points $m_{ij}$, nor the primes $\mu_{ij}$, define a set $P_{N_n}^2$, and that the lopsidedness mentioned is the result of conditions imposed on $C_{N_n}$ to make it a $CW_n$.

Room [5; 376, footnote] expresses the opinion that more general $C_{N_n}$'s than the $CW_n$'s do not exist. He proves (p. 74) that, given in $[n + 1]$ the three lines $l_1$, $l_2$, $l_3$ and $n + 2$ of the $[n - 1]$'s, namely, $\lambda_1, \lambda_2, \lambda_3, \lambda_4, \dots, \lambda_{n+2}$, with the proper incidences, a unique $C_{N_n}$ can be constructed to contain these spaces and that this is a $CW_n$. Yet the restriction to three lines $l_1$, $l_2$, $l_3$ inevitably suggests a $T(2, n, n + 1)$, which the argument confirms, and one is tempted to begin with a more symmetrical choice of the given elements, i.e., only pairs of opposite elements, $l_1, \lambda_1$; $l_2, \lambda_2$; $\dots$.


For cases $n = 1$ and $n = 2$, which are completely known, we have the statements:

$n = 1$. Given in [2], 2 opposite pairs with the proper incidences, there is $\infty^0 = 1$ $C_3$ which contains the two given pairs.







$n = 2$. Given in [3], 3 opposite pairs with the proper incidences, there are $\infty^1$ $C_6$'s which contain the three given pairs.

Thus one would certainly feel justified in the expectation that

$n = 3$. Given in [4], 4 opposite pairs with the proper incidences, there are $\infty^{k \geq 1}$ $C_{10}$'s which contain the four given pairs.

If this expectation is correct the $C_{10}$'s would be more general than the $CW_3$'s. For a $CW_3$ has the 12 absolute constants of a $P_{10}^2$ whereas the given four opposite lines and planes of a $C_{10}$ with the given incidences already have 12 absolute constants. The most plausible assumption is that $k = 3$, that it imposes three conditions to ask that the primes $\mu_{ij}$ define a $P_{10}^2$, and that it imposes three more to ask that the points $m_{ij}$ define a $P'^{2}_{10}$. Then the intermediate lopsidedness is removed and $P_{10}^2$, $P'^{2}_{10}$ are the nodes of two paired rational sextics.

To see the implications of this conjecture, let us examine the cases $n = 1$ and $n = 2$. The first is obvious, for the given lines $l_1$, $l_2$ on points $\lambda_2$, $\lambda_1$ respectively determine $\lambda_3 = (l_1, l_2)$ and $l_3 = (\lambda_1, \lambda_2)$. In the second case of the double-six of a cubic surface, let the lines $l_1$, $l_2$, $l_3$ and the lines $\lambda_1$, $\lambda_2$, $\lambda_3$ be given with the incidence of $l_i$ and $\lambda_j$ when $i \neq j$. We wish to determine three lines $l_4$, $l_5$, $l_6$ across $\lambda_1$, $\lambda_2$, $\lambda_3$, and three lines $\lambda_4$, $\lambda_5$, $\lambda_6$ across $l_1$, $l_2$, $l_3$ such that $l_k$ is incident with $\lambda_m$, $\lambda_n$ ($k, m, n = 4, 5, 6$). Since the given lines involve only 18 constants and the double-six involves 19, we expect $\infty^1$ solutions. What we wish to emphasize is that one solution implies $\infty^1$ solutions. For, the lines across $l_1$, $l_2$, $l_3$ are those of a regulus $R_\tau$. The coordinates of such a line are quadratic in the parameter $\tau$. Similarly the lines across $\lambda_1$, $\lambda_2$, $\lambda_3$ belong to a regulus $R_t$ and have coordinates quadratic in $t$. The incidence condition of line $\tau$ of $R_\tau$ and line $t$ of $R_t$ is the vanishing of a double binary form, $f(t^2, \tau^2) = (\alpha\tau)^2(at)^2 = 0$. If, then, there is one double-six, this form has one closed configuration, $\tau = \tau_4, \tau_5, \tau_6$, $t = t_4, t_5, t_6$, such that $f(t^2, \tau_k^2) = 0$ has roots $t_m$, $t_n$. But the existence of one such closed configuration for $f$ implies the existence of $\infty^1$ (see [3]). Thus the three given pairs $l_i$, $\lambda_i$ define a poristic form, $f(t^2, \tau^2)$, and the configurations defined by this poristic form yield the $\infty^1$ required $C_6$'s.

Naturally this well-known case admits of complete exploration. If $(l_i, \lambda_j)$ is the plane containing the incident lines $l_i$, $\lambda_j$, there is a pencil of cubic surfaces containing the first three given pairs, namely,

$(l_1\lambda_2)(l_2\lambda_3)(l_3\lambda_1) + k\,(l_1\lambda_3)(l_2\lambda_1)(l_3\lambda_2) = 0,$

the residual cubic curve of the base being composed of the three lines such as $(l_1\lambda_2) = (l_2\lambda_1) = 0$. This pencil cuts $R_\tau$ and $R_t$ in projectively related triads of lines,

$(\gamma t)^3 + k(\delta t)^3 = 0, \qquad (c\tau)^3 + k(d\tau)^3 = 0.$

Thus we find the obviously poristic determinant form,

$D_{3,3} = \begin{vmatrix} (\gamma t)^3 & (\delta t)^3 \\ (c\tau)^3 & (d\tau)^3 \end{vmatrix},$

closed with respect to triads $t_4, t_5, t_6$; $\tau_4, \tau_5, \tau_6$, any value of which determines $k$, and therefore the set of six values. But a generator $t = t_k$ of $R_t$ cuts $R_\tau$ in only two points, and therefore fails to meet one generator $\tau$ of $R_\tau$. Thus $D_{3,3}$ factors into two forms,

$D_{3,3} = g(t, \tau) \cdot f(t^2, \tau^2),$

the first factor indicating non-incidence, the second, incidence of generators $t$, $\tau$, and the closure of $D_{3,3}$ implies the closure, or poristic character, of each factor.
Part of the above argument in connection with a $C_6$ goes on to the $C_{10}$ in a space [4] determined by four given line-plane pairs, $l_1, \lambda_1$; $\dots$; $l_4, \lambda_4$. The given pairs have 12 absolute constants so that there must be at least a finite number of even the $CW_3$'s determined by them. Across the four planes $\lambda_1, \dots, \lambda_4$ there is a set of $\infty^2$ lines, one line through each generic point of a particular plane. If $a$, $b$, $c$ are a fixed triangle on $\lambda_1$ and $x_0 a + x_1 b + x_2 c$ a generic point $x$, the coordinates of the line through $x$ across $\lambda_2$, $\lambda_3$, $\lambda_4$ are cubic polynomials in $x$ which vanish at the points $x$ in which $\lambda_1$ is met by $\lambda_2$, $\lambda_3$, $\lambda_4$. Let us call this line $l_x$. Similarly, if $y = y_0, y_1, y_2$ is a generic prime on $l_1$, this prime contains a plane $\lambda_y$ which crosses the four lines $l_1$, $l_2$, $l_3$, $l_4$. The coordinates of this plane are cubic polynomials in $y$ which vanish for the three positions $y$ determined by the primes containing $l_1$ and one of $l_2$, $l_3$, $l_4$. The incidence condition of the line $l_x$ and the plane $\lambda_y$ is $f(x^3, y^3) = 0$. The question we raise is whether this form is poristic with infinitely many configurations $x^{(5)}, \dots, x^{(10)}$; $y^{(5)}, \dots, y^{(10)}$ such that $f\big((x^{(k)})^3, (y^{(l)})^3\big) = 0$ for $k, l = 5, \dots, 10$; $k \neq l$. Evidently each such configuration gives rise to six pairs $l_5, \lambda_5$; $\dots$; $l_{10}, \lambda_{10}$ of line-planes which extend the four given pairs to make up a configuration $C_{10}$, and a poristic form would yield configurations which are not $CW_3$'s. Since there is as yet no theory for the poristic double ternary form, we leave the problem at this point.


BIBLIOGRAPHY 


1. A. B. Coble, Algebraic Geometry and Theta Functions, American Mathematical Society Colloquium Publications, vol. 10(1929).
2. A. B. Coble, Conditions on the nodes of a rational plane curve, this Journal, vol. 7(1940), pp. 396-410.
3. A. B. Coble, Multiple binary forms with the closure property, American Journal of Mathematics, vol. 43(1921), pp. 1-19.
4. A. B. Coble, Trilinear forms, this Journal, vol. 7(1940), pp. 380-395.
5. T. G. Room, The Geometry of Determinantal Loci, Cambridge University Press, London, 1938.
6. F. Schur, Ueber die durch collineare Grundgebilde erzeugten Curven und Flächen, Mathematische Annalen, vol. 18(1881), pp. 1-32.
7. F. P. White, On certain nets of plane curves, Proceedings of the Cambridge Philosophical Society, vol. 22(1923), pp. 1-10.


UNIVERSITY OF ILLINOIS. 





A PARTICULAR SET OF TEN POINTS IN SPACE 


By Arthur B. Coble


1. Introduction. A generic trilinear form $T(2, 3, 4) = (\alpha x)(\beta y)(\gamma z)$ with digredient variables $x$, $y$, $z$ which are contragredient to $\xi$, $\eta$, $\zeta$ respectively in their spaces [2], [3], [4] depends upon 12 absolute constants. There are 10 pairs $x, y = p_i, q_i$ ($i = 1, \dots, 10$) which are neutral for $z$ in $T = 0$. In an earlier paper [4], it is proved that the set of ten points $p_i$, $P_{10}^2$, and the set of ten points $q_i$, $Q_{10}^3$, are connected by the double identity in $\xi$, $\eta$,

(1) $\sum_{i=1}^{10} (q_i\eta)(p_i\xi)^2 \equiv 0.$

In this identity the set $P_{10}^2$ is generic with 12 absolute constants. The set $Q_{10}^3$ is then projectively determined by the identity and thus is subject to three projective conditions. It is the purpose of this paper to determine the nature of these conditions, and to explore some of their consequences.


2. The generic character of 9 points of $Q_{10}^3$. In this section we prove that the three conditions on $Q_{10}^3$ all fall on the tenth point when the first nine are given generically. This is somewhat unusual. For example, the ten nodes of a rational sextic are subject to three conditions and only eight can be chosen generically; in space the nine nodes of a symmetroid are subject to three conditions and only seven can be chosen generically. We observe first that the squares $(p\xi)^2$ of points $p$ in [2] represent a mapping of the plane [2] upon the points $r$ of a Veronese $V_2^4$ in [5], the point $p_i$ mapping into a point $r_i$. Thus we have a set $R_{10}^5$ on $V_2^4$, and the identity (1) asserts that the set $Q_{10}^3$ is associated to the set $R_{10}^5$. Hence

(1) The three conditions on $Q_{10}^3$ appear in its associated set $R_{10}^5$ as the three conditions that $R_{10}^5$ is on a Veronese $V_2^4$.


For, it is known [3; Theorem 18] that on nine generic points in [5] there are four $V_2^4$'s, whence the three conditions on $R_{10}^5$ fall on the tenth when the first nine are given. Let then $r_1, \dots, r_9$ be projected from $r_0$ into a set $S_9^4 = s_1, \dots, s_9$ in [4], the $V_2^4$ on $R_{10}^5$ projecting into an $M_2^3$ on $S_9^4$. This $S_9^4$ is associated to $Q_9^3 = q_1, \dots, q_9$. Again it is known [3; Theorem 15] that on an $S_9^4$ there are two $M_2^3$'s, these being paired with the two reguli on the associated $Q_9^3$. Since any $M_2^3$ in [4] is the map of the plane by conics on a point, any $S_9^4$ and $M_2^3$ on it can be obtained by such a mapping from nine points of the plane. Since the above $S_9^4$ is obtained from $p_1, \dots, p_9$ by the mapping with conics on $p_0$, the $S_9^4$ is a generic set, and its associated set $Q_9^3$ is also generic. Hence


Received March 24, 1942.










(2) Any nine points of $Q_{10}^3$ constitute a generic set of nine points, and the two reguli on the nine points are separated.
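
The Veronese mapping by squares used at the start of this section can be made concrete. The sketch below is an illustration under stated assumptions, not the paper's notation: a point $p = (p_0, p_1, p_2)$ of the plane is sent to the six coefficients of the quadratic form $(p\xi)^2$, written so that the mixed terms carry a factor 2, and a point of [5] is tested for lying on the Veronese surface by checking that the symmetric $3 \times 3$ matrix of its coordinates has rank at most one. The function names are hypothetical.

```python
from itertools import combinations


def veronese(p):
    """Map p = (p0, p1, p2) to the coefficients of (p0*xi0 + p1*xi1 + p2*xi2)^2.

    Coordinate order: (r00, r11, r22, r12, r02, r01), with the form written as
    sum r_ii xi_i^2 + 2 * sum r_ij xi_i xi_j, so that r_ij = p_i * p_j.
    """
    p0, p1, p2 = p
    return (p0 * p0, p1 * p1, p2 * p2, p1 * p2, p0 * p2, p0 * p1)


def on_veronese(r):
    """True when all 2x2 minors of the symmetric matrix of r vanish (rank <= 1)."""
    r00, r11, r22, r12, r02, r01 = r
    m = [[r00, r01, r02],
         [r01, r11, r12],
         [r02, r12, r22]]
    for i, j in combinations(range(3), 2):        # pairs of rows
        for k, l in combinations(range(3), 2):    # pairs of columns
            if m[i][k] * m[j][l] - m[i][l] * m[j][k] != 0:
                return False
    return True


if __name__ == "__main__":
    print(on_veronese(veronese((1, 2, 3))))     # image of a plane point
    print(on_veronese((1, 1, 1, 0, 0, 0)))      # a point of [5] off the surface
```

The ten points $r_i$ of the text are the images `veronese(p_i)`; the three conditions of theorem (1) say that a set of ten points of [5] is of this special kind.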


3. The construction of $Q_{10}^3$ when nine points are given. Let $x^{(2)}$, $z^{(4)}$ be neutral for $y$ in $T = 0$, i.e., $(\alpha x)(\gamma z)\beta_i = 0$. Then, for given $x$ in [2], $z$ is on the Bordiga [1] surface $F_2^6$ in [4], the White surface $W_3$ of the preceding paper. For given $z$ in [4], there is a point $y$ in [3] which with $z$ is neutral for $x$ in $T = 0$, i.e., $\alpha_i(\gamma z)(\beta y) = 0$. However, for a point $z$ on $F_2^6$, these equations in $y$ are linearly related, and $y$ is on a line $p^{(3)}$ of the Semple [7] congruence. The properties of this correspondence between $x$ in [2], $z$ on $F_2^6$ in [4], and line $p^{(3)}$ in [3] have been summarized in Room [6], and we apply them without derivation. To the directions about $p_i$, there correspond the points $z^{(4)}$ on a line $l_i$ of $F_2^6$, and the lines $p$ of a regulus $R_i(t)$ of the quadric $B_i$ on all of the points of $Q_{10}^3$ except $q_i$. This immediately indicates a separation of the reguli on all ten of the quadrics $B_i$ into reguli $R_i(t)$ and $R_i(\tau)$. Again to the points $x^{(2)}$ of the cubic curve $C_i$ on all the points of $P_{10}^2$ except $p_i$ there correspond the points $z^{(4)}$ of a cubic curve $k_i$ on $F_2^6$ in a plane $\lambda_i$, and the generators $p^{(3)}$ of a cubic cone $K_i$ with vertex at $q_i$ and on the other points of $Q_{10}^3$.

From the obvious fact that the cubic curve $C_j$ has a direction at $p_i$ ($i \neq j$) we conclude that

(1) The cubic cone $K_j$ contains that generator of the regulus $R_i(t)$ which passes through $q_j$.

From this and (2) of the preceding section, we obtain at once the following construction for $Q_{10}^3$.

(2) Given nine generic points $q_1, \dots, q_9$ in [3] and the quadric $B_0$ on them with reguli $R_0(t)$, $R_0(\tau)$, the cubic cones $K_j$ ($j = 1, \dots, 9$), with triple point at $q_j$, on the remaining eight given points, and on the generator of $R_0(t)$ through $q_j$, all pass through the point $q_0$ of $Q_{10}^3$. The cones $K_j'$ similarly defined for the regulus $R_0(\tau)$ of $B_0$ determine a point $q_0'$ which with $q_1, \dots, q_9$ make up a set $Q'^{3}_{10}$.

These theorems, (1) and (2), indicate how the separation of the reguli on the quadrics $B_1, \dots, B_9, B_0$ of $Q_{10}^3$ is effected.


4. A comparison of $Q_{10}^3$ with $\Sigma_{10}^3$. It is clear from the preceding paper that the set $Q_{10}^3$ under consideration, subject to three conditions, is half way along the path of specialization to the set $\Sigma_{10}^3$ of nodes of a symmetroid, a set which is subject to six conditions. Some properties of $Q_{10}^3$ associated with the system $(\infty^4)$ of Reye sextics on $Q_{10}^3$ are found in §§5-8 of the article cited in [4]. We shall be concerned here more particularly with properties which it shares in whole or part with $\Sigma_{10}^3$. One noteworthy property [5; 39] of $\Sigma_{10}^3$ is that if any one of the quadrics $B_1, \dots, B_9, B_0$ has a node, each of the others also has a node.







Thus the discriminants $\Delta_1, \dots, \Delta_9, \Delta_0$ of $B_1, \dots, B_9, B_0$ are identical. In contrast with this we have

(1) If $Q_{10}^3$ is given, the radicals $(\Delta_1)^{1/2}, \dots, (\Delta_0)^{1/2}$ each are rational functions of any one of them, and of the given $Q_{10}^3$.

This is a consequence of the construction (2) of the preceding section. For, if a value of $(\Delta_0)^{1/2}$ is given to isolate the regulus $R_0(t)$ on $B_0$ and thus to isolate $q_0$ from $q_0'$, there is equally well isolated a regulus $R_i(t)$ on $B_i$ ($i = 1, \dots, 9$), and thus a value of $(\Delta_i)^{1/2}$.

There is, however, a considerable difference in the application of the construction (2) of §3 to $\Sigma_{10}^3$ and to $Q_{10}^3$. If $q_1, \dots, q_9$ are subject to the three conditions that they belong to $\Sigma_{10}^3$, there is a pencil of cones $K_j$ and there is necessarily a member of this pencil on each of the generators of $B_0$ through $q_j$. Thus $q_0$ and $q_0'$ coincide (three conditions). If, however, $q_1, \dots, q_9$ of $Q_{10}^3$ are subject to the one condition that $B_0$ have a node, and thus the reguli $R_0(t)$, $R_0(\tau)$ coincide, then $q_0'$ approaches $q_0$ in some direction (one condition). It is not then necessarily true that $q_i'$ approaches coincidence with $q_i$ for the given $q_1, \dots, q_9$.

Another interesting relation of the two sets $Q_{10}^3$ of (2) of §3 is the following:

(2) If $Q_{10}^3 = q_1, \dots, q_9, q_0$ and $Q'^{3}_{10} = q_1, \dots, q_9, q_0'$ are constructed as in (2) of §3, and if $p_1, \dots, p_9, p_0$ and $p_1', \dots, p_9', p_0'$ are the ternary sets $P_{10}^2$, $P'^{2}_{10}$ related to them as in (1) of §1, then $P_{10}^2$, $P'^{2}_{10}$ are the direct and inverse F-points of a Cremona transformation $R_0$ of order 17 with 12-fold F-points at $p_0$, $p_0'$ and 4-fold F-points at $p_i$, $p_i'$ ($i = 1, \dots, 9$).
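
The numerical type stated in theorem (2) can be checked against the two classical necessary equations for a planar Cremona transformation of order $d$ with F-point multiplicities $m_i$, namely $\sum m_i^2 = d^2 - 1$ and $\sum m_i = 3d - 3$. The short sketch below only verifies these equations; they are necessary, not sufficient, and the function name is hypothetical. The quadratic and quintic types tested alongside are the standard ones (three simple points, six double points) and are assumed here to be the types of the quadratic and quintic transformations used later in this section.

```python
def satisfies_cremona_equations(order, multiplicities):
    """Check the two necessary equations for a planar Cremona type:
    sum m_i^2 = d^2 - 1 and sum m_i = 3d - 3."""
    return (sum(m * m for m in multiplicities) == order * order - 1
            and sum(multiplicities) == 3 * order - 3)


if __name__ == "__main__":
    # Order 17: one 12-fold F-point and nine 4-fold F-points, as in theorem (2).
    print(satisfies_cremona_equations(17, [12] + [4] * 9))
    # Quadratic transformation with three simple F-points.
    print(satisfies_cremona_equations(2, [1, 1, 1]))
    # Quintic transformation with six double F-points.
    print(satisfies_cremona_equations(5, [2] * 6))
```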

Under this transformation $R_0$, the directions at $p_0'$ correspond to the P-curve defined at $P_{10}^2$ by $(0^8 i^3)^{12}$, and the directions at $p_j'$ to the P-curve $(0^3 i^1 j^0)^4$ ($i \neq j$) [2]. We recall from §2 that $q_1, \dots, q_9$ are associated to $S_9^4 = s_1, \dots, s_9$ in [4] and that the plane of $P_{10}^2$ is mapped by conics on $p_0$ into an $M_2^3$ on $S_9^4$, the points $p_1, \dots, p_9$ mapping into $s_1, \dots, s_9$. If $P'^{2}_{10}$ is the set of inverse F-points of $R_0$ with F-points at $P_{10}^2$, and if also $p_{i_1}, \dots, p_{i_5}$ are on a conic with $p_0$, then $p'_{i_1}, \dots, p'_{i_5}$ are on a conic with $p_0'$ ($i_1, \dots, i_5 = 1, \dots, 9$). Hence, conics on $p_0'$ map the plane of $P'^{2}_{10}$ into an $M'^{3}_2$ on $S'^{4}_9$ with $S_9^4$ and $S'^{4}_9$ so related that if five points of $S_9^4$ are on a prime, the like-named points of $S'^{4}_9$ are also on a prime, and vice versa. Hence, $S_9^4$ and $S'^{4}_9$ are projective and may be superposed. Since lines on $p_0$, which map into generators of $M_2^3$, do not pass by $R_0$ into lines on $p_0'$, the $M'^{3}_2$ on $S'^{4}_9$ is not superposed on the $M_2^3$ on $S_9^4$. Thus the two planar sets are the two projectively distinct sets which determine $Q_{10}^3$, $Q'^{3}_{10}$ respectively.

In case $Q_{10}^3$ is a $\Sigma_{10}^3$, the two sets $Q_{10}^3$, $Q'^{3}_{10}$ coincide throughout. However, there are still two projectively distinct sets $P_{10}^2$, $P'^{2}_{10}$ related as in the theorem, and related as above to the two $M_2^3$'s on $S_9^4$, or to the two reguli on the quadric $B_0$. These are now the two sets of nodes of two "paired" rational sextics [2; 252, (4)].













Conditions on $P_{10}^2$, $P'^{2}_{10}$ equivalent to those given in (2) can be expressed in the following simpler form.

(3) The quadratic transformation $A_{0 i_1 i_2}$ with F-points at $p_0$, $p_{i_1}$, $p_{i_2}$ transforms the remaining seven points of $P_{10}^2$ into a set $R_7^2$ projective to the set $R'^{2}_7$ into which the remaining seven points of $P'^{2}_{10}$ are carried by $A'_{0 i_1 i_2}$.

For, the projection of $S_9^4$ from $s_{i_1}$, $s_{i_2}$ yields a set $R_7^2$ which is associated to the set $Q_7^3$ obtained from $q_1, \dots, q_9$ by deleting $q_{i_1}$, $q_{i_2}$. But two sets $R_7^2$, $R'^{2}_7$ each associated to $Q_7^3$ are projective to each other. This theorem (3) also applies to the case of the nodes of two paired rational sextics and then is equivalent to the statement given elsewhere [2; 253, (7)].

We prove finally that the conditions on $Q_{10}^3$ are invariant under properly chosen Cremona transformation, the precise statement being:

(4) If $Q_{10}^3$ is related to $P_{10}^2$ as in (1) of §1, and if $Q_{10}^3$ is congruent to $Q'^{3}_{10}$ under the regular cubic transformation $A_{1234}$, and $P_{10}^2$ is congruent to $P'^{2}_{10}$ under the quintic transformation $A_{567890}$, then $Q'^{3}_{10}$ is still related to $P'^{2}_{10}$ as in (1) of §1.

For, if $P'^{2}_{10}$ is congruent to $P_{10}^2$ under $A_{567890}$, the set $R_{10}^5$ on $V_2^4$ obtained from $P_{10}^2$ is congruent to a set $R'^{5}_{10}$ on $V'^{4}_2$ obtained from $R_{10}^5$ by the regular transformation in [5] with F-points at $r_5, \dots, r_9, r_0$ [3; 16]. Then their associated sets are congruent under $A_{1234}$.

This is the theorem [2; 257, (14)] concerning $Q_{10}^3 = \Sigma_{10}^3$ and $P_{10}^2$ = nodes of $p_2^6(t)$. The difference in the two cases is that the number of projectively distinct congruent sets is finite in the case of $\Sigma_{10}^3$, and infinite in the case of $Q_{10}^3$.


BIBLIOGRAPHY 


1. G. Bordiga, La superficie del 6° ordine con 10 rette, nello spazio $R_4$ e le sue projezione nello spazio ordinario, Atti R. Acad. Lincei (Mem.), vol. 4(1887), pp. 182-203.
2. A. B. Coble, Algebraic Geometry and Theta Functions, Colloquium Publications, American Mathematical Society, vol. 10(1929), pp. 252-254.
3. A. B. Coble, Associated sets of points, Transactions of the American Mathematical Society, vol. 24(1922), pp. 1-20.
4. A. B. Coble, Trilinear forms, this Journal, vol. 7(1940), pp. 380-395.
5. J. R. Conner, The rational sextic curve, and the Cayley symmetroid, American Journal of Mathematics, vol. 37(1915), pp. 29-42.
6. T. G. Room, The Geometry of Determinantal Loci, Cambridge University Press, London, 1938, pp. 384-385.
7. J. G. Semple, On representations of line-congruences of the second and third orders, Proceedings of the London Mathematical Society (2), vol. 35(1933), pp. 294-324.


UNIVERSITY OF ILLINOIS. 








CONTENTS 


The asymptotic forms of the solutions of an ordinary linear matric differential equation in the complex domain. By Homer E. Newell, Jr. 245
A self-reciprocal function. By R. S. Varma
The structure of the group of $\mathfrak{P}$-adic 1-units. By David Gilbarg
An explicit formula for the solution of the ultrahyperbolic equation in four independent variables. By Glynn Owens
Generalized arithmetic. By Garrett Birkhoff
Maximal fields with valuations. By Irving Kaplansky
Algebraic properties of certain matrices over a ring. By Neal H. McCoy 322
Central chains of ideals in an associative ring. By S. A. Jennings
Generalized "sandwich" theorems. By A. H. Stone and J. W. Tukey
The continued fraction as a sequence of linear transformations. By J. Findlay Paydon and H. S. Wall 360
Some properties of summability. By J. D. Hill
A general equation for relaxation oscillations. By Norman Levinson and Oliver K. Smith 382
The divergence of non-harmonic gap series. By Philip Hartman
Influence of the signs of the derivatives of a function on its analytic character. By R. P. Boas, Jr. and G. Pólya
The distribution of primes. By Aurel Wintner
Parametric solutions of certain Diophantine equations. By E. T. Bell
The double-$N_n$ configuration. By Arthur B. Coble
A particular set of ten points in space. By Arthur B. Coble





CONSERVATION OF SCHOLARLY JOURNALS 








The American Library Association created this last year the Committee on Aid to Libraries in War Areas, headed by John R. Russell, the Librarian of the University of Rochester. The Committee is faced with numerous serious problems and hopes that American scholars and scientists will be of considerable aid in the solution of one of these problems.

One of the most difficult tasks in library reconstruction after the first World War was that of completing foreign institutional sets of American scholarly, scientific, and technical periodicals. The attempt to avoid a duplication of that situation is now the concern of the Committee.

Many sets of journals will be broken by the financial inability of the institutions to renew subscriptions. As far as possible they will be completed from a stock of periodicals being purchased by the Committee. Many more have been broken through mail difficulties and loss of shipments, while other sets will have disappeared in the destruction of libraries. The size of the eventual demand is impossible to estimate, but requests received by the Committee already give evidence that it will be enormous.

With an imminent paper shortage attempts are being made to collect old periodicals for pulp. Fearing this possible reduction in the already limited supply of scholarly and scientific journals, the Committee hopes to enlist the cooperation of subscribers to this journal in preventing the sacrifice of this type of material to the pulp demand. It is scarcely necessary to mention the appreciation of foreign institutions and scholars for this activity.

Questions concerning the project or concerning the value of particular periodicals to the project should be directed to Wayne M. Hartwell, Executive Assistant to the Committee on Aid to Libraries in War Areas, Rush Rhees Library, University of Rochester, Rochester, New York.


(FOR CONTENTS, SEE INSIDE BACK COVER.) 










